OBJECTIVE CONFIGURAL RULES FOR DISCRIMINATING
PSYCHOTIC FROM NEUROTIC MMPI PROFILES¹


PAUL E. MEEHL
University of Minnesota

W. GRANT DAHLSTROM
University of North Carolina


One contribution which psychologists are expected to make toward the clinical assessment of psychiatric patients is helping determine the presence or degree of psychotic tendencies. This problem presents itself, for reasons of varying pragmatic import, in forms such as: What formal (nosological) diagnosis should be given to the patient? To what extent is there a danger of serious acting out if this patient is seen on an outpatient basis? Is there a psychotic process or structure behind the superficially neurotic manifestations, such that methods of treatment appropriate for neurotic patients are likely to be inefficacious or even deleterious? What is the long-term prognosis, requiring consideration in vocational and educational guidance, advice to relatives, recommendation to a rating board, communications to a family physician, social agency, or court? Should a conservative staff feel entitled to utilize the more radical kinds of treatment, e.g., regressive electroshock, when their policy is to avoid such procedures in the treatment of psychoneuroses? Is this the sort of patient who should probably be treated by Dr. X, who is especially gifted in the treatment of mild schizophrenic conditions and, unlike our two other available therapists, has a preference for working with them? We shall not discuss the utility in clinical decision-making of answering these questions. Suffice it to say that the assessment of “psychotic tendency,” phrased in one form or another, is one of the tasks with which the clinical psychologist in most installations will at times be confronted.

¹ This study was supported by a Grant-in-Aid of Research from the University of Minnesota Graduate School.

The MMPI is among the tests utilized for this purpose. If patients could be effectively sorted into nosological categories simply by
identifying their highest MMPI score, each 
of the rubrics in the psychiatric nomenclature 
having a one-to-one correspondence with the 
MMPI variables as named, the present in- 
vestigation would be pointless. It is, however, 
well known that this simple procedure does 
not work. This is one reason why clinicians 
prefer to characterize profiles by code rather 
than by the original scale names. In the Uni- 
versity of Minnesota Hospitals the mimeo’d 
profile form used in patients’ charts has only 
code designation for scales. (See Hathaway, 
1947; Hathaway & Meehl, 1951a, 1951b;
Meehl, 1950b; Welsh & Dahlstrom, 1956.) 
It is true that one can do significantly better 
than chance by paying attention only to the 
highest one or two T scores (Hathaway &
Meehl, 1951a; Meehl, 1959a) but the amount 
of improvement over chance, while it testifies 
to the presence of some validity in the in- 
strument, is not great enough to be very use- 
ful. Clinicians even moderately familiar with 
the MMPI have, therefore, been accustomed 
for over a decade to interpreting the results 
by paying attention to the profile pattern in 
some kind of joint relation to the overall 
elevation and especially the elevation of the 
most deviant scores. 

While it would not be surprising to find 
that the “clinical eye” had trained itself to 
recognize configurations not readily identified 
by conventional linear methods of statistical 
analysis (Horst, 1954; Lubin & Osburn, 
1957), it might be presumed that the cli- 
nician’s subjective judgment, however experi- 
enced, assigns less than optimal weights. In 
addition to this systematic bias, the human 
judge inevitably throws in some more or less 
random error variance due to his unreliability. 
In trusting the clinical eye (at that stage
of the total decision-making process which
is concerned to classify the profile, although 
not necessarily the patient!) we treat the 
nonoptimality of the clinician’s unverbalized 
function, and the temporal instability in its 
application, as a price paid in order to get 
the advantages of a configural approach to 
the profile pattern. Configurality tends to 
make the clinician’s global judgment effec- 
tive; nonoptimal weights and temporal fluc- 
tuation tend to make it ineffective. The net 
efficiency of the clinician’s “judgmental” clas- 
sification or ordering of profiles is the out- 
come of these conflicting forces (Meehl, 
1959a). For this reason, the present investi- 
gation has, in addition to its technological 
aim of providing an aid to MMPI users, an 
intrinsic methodological interest. To what ex- 
tent can the complex configural generaliza- 
tions exhibited by the behavior of MMPI- 
skilled clinicians be frozen into a set of cleri- 
cal operations? If this could be done, it would 
reduce the amount of clinical experience re- 
quired by the test user; more importantly, it 
is probable on theoretical grounds that if the 
essential features being reacted to by the cli- 
nician can be dealt with in an actuarial way, 
even the asymptote of correct decisions will 
be increased (Estes, 1957, p. 615). 

More important in the long run than either 
the immediately pragmatic or methodological 
interests, however, is the possible construct 
validity (APA Test Standards Committee, 
1954; Cronbach & Meehl, 1955) of formal 
criteria for the “psychotic profile.” Psycho- 
metric devices should ultimately reach a point 
of development comparable to the laboratory 
technics of internal medicine (biochemical 
tests, biopsy, roentgenology) such that they 
are thought of technologically as on the same 
level with the clinical interview, or ward rat- 
ings, rather than as clever devices for predict- 
ing these latter “criteria” (Meehl, 1959b). 
When a patient responds to the verbal stimuli 
which constitute the MMPI pool in a manner 
characteristic of previously studied patients 
recognized by the familiar clinical criteria as 
schizophrenic, but the clinical staff are not 
inclined to so diagnose him because of the 
absence of the traditionally emphasized (and,
from Bleuler’s point of view, mainly secondary)
“frankly schizophrenic” symptoms, it is
debatable which of these facts about the pa- 
tient’s behavior should be given the greater 
weight as a probabilistic indicator of his in- 
ternal psychological state, structure, and dis- 
positions. The development of an objective set 
of profile pattern criteria for the identifica- 
tion of recognized, diagnosed manic-depressive 
and schizophrenic psychotics has numerous 
possibilities with regard to subsequent lines 
of investigation directed at achieving a “bootstraps
effect” (Cronbach & Meehl, 1955, p.
286). We inject this methodological note to
provide adequate motivation for the contaminated
derivation procedures employed.

In what follows, the terms “hit” and “miss” 
will be employed in the sense of concurrent 
validity, except where specifically mentioned 
otherwise. Hints as to construct validity ap- 
pear in certain of the cross-validation sam- 
ples, but our main purpose is to present the 
available concurrent validity data for the 
reader to interpret and utilize however he 
sees fit within his own pragmatic and theo- 
retical framework. To avoid the tedious repe- 
tition of quotation marks, the words “miss” 
and “hit” will hereafter be employed without 
them. 


DERIVATION OF THE CONFIGURAL RULES 


The ultimate justification for a searching procedure is the cross-validative success of the final product. We will indicate only briefly the general working assumptions and provisional hypotheses which underlay our derivation methods. A guiding assumption which would not be admitted by all workers in psychopathology was that there exists some degree of objective typology or taxonomy in a psychiatric population, which will be reflected in the occurrence of profile groups exhibiting a tendency to a kind of “psychometric discontinuity.” We did not anticipate that a continuous function of the MMPI variables could be invented which would be of manageable complexity and stable parameters and yet do justice to the configural effects. For example, it seemed doubtful that the weights optimal for making the psychotic-neurotic discrimination within profiles exhibiting a 27 code (commonly found both in psychotic depressions and in neurotic depressive reactions
or anxiety states) would be very close to the 
weights optimal for distinguishing between 
psychotic and neurotic patients exhibiting less 
self-concern and subjective discomfort and 
who handle their anxiety via somatizing or 
projecting mechanisms and present profiles 
peaked at 3, 4, or 6. So a mixed summative 
and successive-hurdle model was used through- 
out in preference to a pure summative model, 
continuous variables and difference scores be- 
ing distributed separately for analysis within 
relatively more homogeneous groups initially 
set apart on the basis of crude but configural 
criteria such as the Hathaway code. We further assume that because of defects in current nosological concepts (which must, of course, be distinguished methodologically from unreliability in the clinical application of intrinsically powerful diagnostic concepts), or failure of the MMPI scores to provide discriminating information, profile configurations occur which are genuinely without differential significance. Therefore three classification outcomes were allowed: A curve would, by the application of the rules, be classified as either “Psychotic,” “Neurotic,” or “Indeterminate” in form. A third working assumption was that the formal diagnosis, while it is a readily available crude criterion by means of which patterns are initially identified, is far from infallible, so that we are justified in assuming that sometimes the test is right and the official diagnosis is wrong. Consequently, one must avoid the temptation to formulate too many ad hoc rules in the effort to capture every test miss. With this in mind the nonpsychometric case data were deliberately allowed to influence our MMPI rule-making at all stages prior to final cross-validation. Attempts to capture a miss by devising (or revising) a rule were not pursued in those instances in which critical reading of the case material justified serious doubt as to whether the criterion in this case was performing better than the MMPI. The skilled clinical eye was employed as a searcher and idea-originator; statistical runs were employed both as searchers and as checks upon the deliverances of the clinical eye. A fourth working assumption was that only the most obvious test-related theoretical or dynamic assumptions
should play any important part in accepting 
a rule. We take it for granted that present 
knowledge, either of personality dynamics or 
test-taking behavior, is rarely sufficient to 
justify the neglect of a statistical finding; so 
that if a certain pattern “works” it is the task 
of theory, now or in the future, to explain 
why it does. 

On the basis of a preliminary run on the 
MMPI profiles of 41 male neurotics and 39 
male psychotics who had been hospitalized in 
the inpatient service of the University of Minnesota Hospitals prior to 1946, we chose two sets of difference scores. Pt and Hs were paired, both having originally been derived as “neurotic” scales but it being part of MMPI lore and general psychiatric experience that the obsessional veers more toward the psychotic side than does the patient with a preference for somatic symptoms. Sc and D were also paired as a result of this preliminary study. Both of these scales were originally derived on psychotic patients but it is known that D, a “mood” scale, is often elevated in neuroses and is markedly so in many cases diagnosed anxiety-neurosis. Severe anxiety states run a very high D, yet some of these cases are (as shown by subsequent course or atypical symptoms) actually schizophrenic. It seemed plausible that one might counter-balance these factors by pairing the D scale with the more psychotic Sc scale. A preliminary cross-validation on 99 additional cases (55 neurotic and 54 psychotic) resulted in approximately 67% hits, so these two difference scores were retained.

We next plotted (Sc - D) against (Pt - Hs) on an expanded sample of 104 psychotics and 128 neurotics; five diagonal “bands” were set up on this plot by drawing lines at 45° so located that the central band contained approximately equal numbers of psychotics and neurotics, the extreme bands a distinct preponderance of one group (75% psychotics in the upper right corner and 83% neurotics in the lower left) and the intermediate bands, two and four, showed a trend but with numerous misclassifications. The equations of these lines, with the bracketing of the four variables rearranged for easier computation, involve the quantity denoted Beta in the final rules, and the five regions on this graph correspond to the five bands coordinated to these Beta values. Beta = (Pt + Sc) - (Hs + D).
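
The regrouping of the two difference scores into Beta can be checked numerically. What follows is a minimal Python sketch, assuming the four scale scores are supplied as T scores in a plain dictionary; the function name `beta_index` and the profile values are hypothetical illustrations, not material from the study.

```python
def beta_index(t):
    """Beta = (Pt + Sc) - (Hs + D), computed from a dict of T scores."""
    return (t["Pt"] + t["Sc"]) - (t["Hs"] + t["D"])


# Hypothetical profile, for illustration only (not a case from the study).
profile = {"Hs": 58, "D": 72, "Pt": 75, "Sc": 81}

# Beta should equal the sum of the two original difference scores.
sum_of_differences = (profile["Sc"] - profile["D"]) + (profile["Pt"] - profile["Hs"])
beta = beta_index(profile)
```

If the 45° band boundaries run from upper left to lower right (as the corner description suggests), each boundary is a line of constant (Sc - D) + (Pt - Hs), so a band corresponds to an interval of Beta.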

After a study of the cases missed and inspection of several possible distributions of difference scores selected for nonoverlapping with the first two, two other pairs were chosen. (Pa - Hy) showed a good separation among a subset of cases not discriminable by the band method, and had several armchair features to recommend it. Neither of the component scales contains a suppressor correction (McKinley, Hathaway, & Meehl, 1950; Meehl & Hathaway, 1946); both scales have numerous “subtle” items involving denial of pathology and the self-image of normality, rationality, and psychic health (Gough, 1954; Meehl, 1950a; Meehl & Hathaway, 1946; Seeman, 1952, 1953; Wiener, 1948, 1951); both are extrapunitive or impunitive rather than intropunitive, tending to manipulate the environment or to develop symptoms and traits involving less subjective discomfort and anxiety. Both are associated with an inability to “think psychologically,” marked lack of insight, and resentment of psychiatric exploration. The (Pa - Hy) difference score might be thought of as located on a psychotic-neurotic axis different from that represented by (Pt - Hs).

The second new difference score was (Pd - Hs). Here an indicator of the acting out, extrapunitive, “tough” component (as contrasted with the more decompensated, suffering and self-concerned individual given to elevations on Pt and Sc) is balanced off against a somatizing and often passive-aggressive neurotic element. These two scores undergo an approximately equal K correction. The sum of the two new difference scores (regrouped for computational ease as the difference of two sums) is designated “Delta” in the final rules. Delta = (Pd + Pa) - (Hs + Hy). We tried thus to counterbalance several difference scores so that the various forms of predominantly psychotic or neurotic mode of adaptation could have a chance to be exhibited in suitable contrast effects. These admittedly loose but clinically plausible considerations were, of course, in each case checked against empirical distributions.
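
Delta can be computed in the same clerical fashion as Beta. The sketch below is illustrative only: it assumes T scores held in a dictionary, and the name `delta_index` and the profile values are hypothetical.

```python
def delta_index(t):
    """Delta = (Pd + Pa) - (Hs + Hy), computed from a dict of T scores."""
    return (t["Pd"] + t["Pa"]) - (t["Hs"] + t["Hy"])


# Hypothetical profile, for illustration only (not a case from the study).
profile = {"Hs": 58, "Hy": 61, "Pd": 70, "Pa": 66}

# Delta should equal (Pa - Hy) + (Pd - Hs), the two new difference scores.
sum_of_differences = (profile["Pa"] - profile["Hy"]) + (profile["Pd"] - profile["Hs"])
delta = delta_index(profile)
```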

A criterion sample of 262 male patients, 
evenly divided between neurotics and psy- 
chotics, constituted the main basis for rule 
development. One hundred eighty-seven of 
these profiles came from the inpatient files of 
the University of Minnesota Psychopathic 
Unit; 49 cases were drawn from records avail- 
able at the VA Hospital at Fort Snelling, 
Minnesota; and 26 cases were drawn from 
the files of a Canadian mental hospital. Records
with ? > 60, L ≥ 70, or F ≥ 80 were
excluded. An effort was made to exclude cases
in which the MMPI was for some reason
(such as catatonic untestability) not taken
until the patient had recovered from an acute
episode, but it is not always possible to de- 
termine this from the records. 
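
The record-exclusion screen amounts to three threshold checks. The following Python sketch assumes the cutoffs read as ? > 60, L ≥ 70, and F ≥ 80 on the T-score scale; the inequality signs are a reading of damaged print and should be treated as an assumption, and the function name `is_excluded` is hypothetical.

```python
def is_excluded(cannot_say_t, l_t, f_t):
    """Return True if a record fails the validity screen.

    Cutoffs are assumed to be: ? > 60, L >= 70, F >= 80 (T scores).
    """
    return cannot_say_t > 60 or l_t >= 70 or f_t >= 80


# Hypothetical records, for illustration only.
kept = is_excluded(50, 55, 65)       # all three scales below their cutoffs
dropped = is_excluded(50, 72, 60)    # L at or above 70
```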

Preliminary to the statistical study of these 
profiles, one of us (PEM) went through the 
(randomized) set of profiles inspectionally 
and classified each as neurotic or psychotic. 
Several months later this procedure was re- 
peated. The results of this sorting will be re- 
ferred to as the “impressionistic profile clas- 
sification,” and it was part of the evidence 
used in deciding whether an attempt should be made to write a special rule, or modify a tentative rule, in the effort to eliminate a test miss.

Cases falling within each band (Beta region) were distributed as to Delta values. While these distributions usually indicated respectable statistical validity, from the standpoint of clinical practice they showed misses felt to be avoidable on the basis of inspection and previous clinical experience. Furthermore, there appeared to be obvious “holes” and deviant “clumps,” suggesting the presence of strong minority profile types which might be readily identifiable by features not reflected in Delta. Several searching aids were used, such as reshuffling of the profiles so that they would be laid out on the floor in a different random order, or arranging them in different systematic orders from right to left (e.g., first by absolute elevation of the peak score, then by the first digit of the code, then in order of the Delta value, then in clusters based upon the formal diagnosis).

Profiles falling wholly within the “normal range” (i.e., having no T > 70) behaved differently enough so that a special rule employing the Welsh internalization ratio (Welsh, 1952) was applied to them. A special rule was also invented for mild “fake good” (L = 60) profiles.

These searching procedures resulted in a 
set of 13 rules, arranged in sequence so that 
in applying the system one reads through the 
list of rules in order until he comes to one 
which covers the profile under consideration. 
Application of the rule to this profile then results
in one of four outcomes: Profile “psychotic”
(P), profile “neurotic” (N), profile
indeterminate (I), or “proceed to next relevant
rule.” In this manner all profiles are
finally classifiable under one of the first three 
rubrics P, N, or I. 
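
The sequential application of the rules can be sketched as a simple cascade. The Python below illustrates only the control flow described above: each rule either returns P, N, or I, or returns nothing to signal “proceed to next relevant rule.” The two rules shown, their cutting scores, and the final fallback to I are invented placeholders for illustration; they are not the rules of Appendix A.

```python
from typing import Callable, Dict, List, Optional

Profile = Dict[str, float]
Rule = Callable[[Profile], Optional[str]]


def placeholder_band_rule(p: Profile) -> Optional[str]:
    # Hypothetical cutting scores on Beta; NOT the published values.
    beta = (p["Pt"] + p["Sc"]) - (p["Hs"] + p["D"])
    if beta >= 20:
        return "P"
    if beta <= -20:
        return "N"
    return None  # "proceed to next relevant rule"


def placeholder_catch_all(p: Profile) -> Optional[str]:
    # Hypothetical last rule: anything still undecided is indeterminate.
    return "I"


RULES: List[Rule] = [placeholder_band_rule, placeholder_catch_all]


def classify(profile: Profile, rules: List[Rule] = RULES) -> str:
    """Read through the rules in order; the first decision ends the search."""
    for rule in rules:
        outcome = rule(profile)
        if outcome is not None:
            return outcome
    return "I"  # assumed fallback if no rule covers the profile
```

Because the list order fixes precedence, moving a rule earlier or later changes which profiles it gets to decide, which is why rule sequence matters in a system of this kind.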

This provisional set of rules was then tried 
out on a preliminary cross-validation sample. 
Because of the great temptation to capitalize 
upon sampling errors in such complicated and 
variable searching procedures as those em- 
ployed, one expects a marked degree of 
shrinkage upon cross-validation, and there- 
fore this first cross-validation sample was con- 
ceived of only partly as a check on whether 
we were getting anything but chiefly as a 
means of modifying the rules by the elimi- 
nation of ad hoc subrules or adjustment of 
cutting scores. The preliminary cross-validation sample consisted of 140 file cases of white males aged 18-65 whose MMPI profiles were presumed valid by the above-mentioned ?-L-F criteria and who were not tested in remission so far as could be judged from the staff notes and other data in the chart. None of these patients had received shock therapy prior to testing, and most were tested within two to three days of admission. Ninety-two of the cases originated from the psychiatric unit of the University of Minnesota Hospitals and 48 from the Minneapolis VA Hospital. All were inpatients. The distribution of diagnoses from the combined criterion and preliminary cross-validation samples (N = 402) is shown in Table 1.


TABLE 1

DISTRIBUTION OF DIAGNOSES ON 402 DERIVATION CASES
(Original criterion plus preliminary cross-validation)

Neurotic diagnoses                                 %
Hypochondriasis, PN mixed, Anxiety neurosis,
  Reactive depression, Hysteria                    21.4, 18.4, 18.9, 15.4, 14.4 (pairing of rows and values lost)
Obsessive-compulsive                               8.0
Other                                              2.5
Neurasthenia                                       1.0
Total                                              100.0

Psychotic diagnoses                                %
Schizophrenia (Paranoid, Simple, Hebephrenic,
  Catatonic, Mixed, Other)                         (values lost)
Paranoid State                                     10.9
Paranoia                                           1.0
Manic-depressive manic                             9.4
Manic-depressive depressed                         13.9
Involutional Psychosis                             11.5
Total                                              100.0


Prior to application of the rules, one of us (PEM) read in random order the case summaries of the 92 cases from the university hospital files, deleting material on the psychological test data, the sections presenting differential diagnostic considerations, and the official final diagnosis and prognosis. The reader made his own diagnosis on the basis of this reading. The purpose of this procedure was to provide information on criterion trustworthiness in cases where one of the preliminary rules yielded a test miss and a decision had to be made as to whether this particular miss was real or apparent. If the case reader agreed unreservedly with the psychiatric staff and the latter’s diagnostic summary did not raise any doubts as to the diagnosis, a more persistent effort was made to modify the preliminary rule so as to avoid the miss than if he disagreed with the staff. We were particularly concerned to detect those cases in which the case reader diagnosed psychosis in opposition to a psychoneurotic label attached by the staff, but in which the staff diagnostic summary also included mention of the likelihood of such subclinical diagnoses as, e.g., “incipient schizophrenia” or “strong cyclothymic element.”

In the preliminary cross-validation sample 
the MMPI profiles were again impressionisti- 
cally sorted as “psychotic” or “neurotic” on 
two occasions (separated by several months’ 
time and randomizing their order), and the 
consistency or disagreement of the two sort- 
ings was utilized in study of the individual 
test misses. 

As expected, the shrinkage on cross-valida- 
tion was pronounced. In the criterion sample, 
13% of cases were classified indeterminate, 
and of the remaining 87% of cases for whom 
a decision was provided by the preliminary 
rules, the hit-rate was 89% (for both the 
neurotic and psychotic subpopulations). On 
preliminary cross-validation, the proportion 
of indeterminate profiles does not change 
significantly (10%), but the hit-rate falls
dramatically. The hit-rate for neurotics is 
only 47%, for psychotics 63%. The overall 
hit-rate is 55%, yielding a hit-rate of only 
61% among the 126 cases for which a de- 
terminate classification was made. We there- 
fore see a decline of almost 30% in the hit- 
rate among determinate cases in moving to 
this new sample. The cross-valid hit-rate, 
while statistically significant, is pragmatically 
unimpressive. 
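
The relation between the overall hit-rate and the hit-rate among determinate cases is simple arithmetic, sketched below in Python. The inputs are the figures quoted above (140 cases, 126 determinate, 55% overall hits, 89% in the criterion sample); the variable names are illustrative.

```python
n_total = 140              # preliminary cross-validation cases
n_determinate = 126        # cases given a P or N classification
overall_hit_rate = 0.55    # hits as a proportion of all 140 cases
criterion_hit_rate = 0.89  # hit-rate among determinate criterion cases

# Share of profiles for which the rules reached a decision (0.90).
determinate_share = n_determinate / n_total

# Hits counted only among decided cases (about 0.61).
hit_rate_among_determinate = overall_hit_rate / determinate_share

# Shrinkage from the criterion sample, in proportion units (about 0.28).
shrinkage = criterion_hit_rate - hit_rate_among_determinate
```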

Modification of rules, addition of three new 
rules, and slight alterations in rule sequence 
were made on the basis of these results. Again 
the procedures were complex and variable, 
and concentrated attention upon those rules 
in which the hit-rate was poorest. Seven of 
the original 13 rules were in this manner sub- 
jected to some kind of modification, ranging 
from slight adjustment of a cutting score to 
fairly radical revision. Constant back-refer- 
ence was made during this phase to the effect 
of proposed modifications upon the hit-rate 
in the original 262 criterion cases. 

One interesting result of these procedures 
was the progressive displacement of the origi- 
nally emphasized band rules to the end of the 
rule sequence. The patterns of six scales represented
by Beta and Delta only begin to
function as powerful discriminators after the
curves have initially been divided into ma- 
jor “types.” One is reminded here of the point 
made by Block (1957) in another context, 
that the kind of R covariation which one is 
likely to discover by traditional individual- 
differences methods may be quite misleading 
unless the population has first been divided 
into subpopulations by Q covariation analy- 
sis, the R covariation patterns being some- 
times very different within the several sub- 
population “types.” 

The modifications made resulted in a drop in the criterion group from 87% to 77% determinate classifications with a maintenance of 89% hits among cases classified; in the cross-validation group there was a decline from 90% to 74% determinate classifications, but an associated increase in the hit-rate from 61% to 83% among those classified. For the total augmented criterion group (N = 402), we have rules which leave 24% of the cases indeterminate but give us an 87% confidence when a classification is made. Thus the movement in modification was in the direction of increased recognition of nondifferentiating patterns, liquidating ad hoc rules which served to classify special criterion cases but which collapsed on cross-validation. The final set of rules is presented in Appendix A.²
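
The pooled figures for the augmented group follow from the two subgroup results. The Python sketch below takes the determinate proportions as 77% and 74% and the hit-rates as 89% and 83%, as quoted above; because it works from rounded percentages rather than case counts, the case numbers it implies are approximate.

```python
groups = {
    "criterion": {"n": 262, "determinate": 0.77, "hit_rate": 0.89},
    "preliminary_cross_validation": {"n": 140, "determinate": 0.74, "hit_rate": 0.83},
}

n_total = sum(g["n"] for g in groups.values())
n_classified = sum(g["n"] * g["determinate"] for g in groups.values())
n_hits = sum(g["n"] * g["determinate"] * g["hit_rate"] for g in groups.values())

# Proportion of the 402 cases left unclassified (about 0.24).
indeterminate_share = 1 - n_classified / n_total

# Hit-rate among classified cases in the pooled group (about 0.87).
pooled_confidence = n_hits / n_classified
```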


² A five-page statement of the rules has been deposited with the American Documentation Institute. Order Document No. 6330 from ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress, Washington 25, D. C., remitting in advance $1.25 for microfilm or $1.25 for photocopies. Make checks payable to Chief, Photoduplication Service, Library of Congress. Mimeographed copies of the rules may be obtained from Paul E. Meehl, Box 390 Medical School, University of Minnesota, Minneapolis 14, Minnesota.


CROSS-VALIDATION


For final cross-validation of the modified rules we attempted to secure a sample which would be both large and sufficiently diverse (as to clinical populations and staff diagnostic practices) to provide some information about validity generalization. Through the kindness of several clinical psychologists in providing MMPI data we were able to obtain eight different samples of cross-validation cases from various clinical installations over the country.³ The sample sizes varied from 42 to 273, with a median of 97 cases per sample and a total N = 988. A more detailed characterization of these samples is contained in Appendix B.⁴
The samples vary as to nature of the popula- 
tion (VA and non-VA, outpatient and inpa- 
tient). They are geographically dispersed, and 
so far as known to us there is considerable 
variation as to the predominant local theoretical and diagnostic orientation. All, however, are males. They vary widely in the diagnostic “purity” of the cases. In three samples (A, B, E) the criterion is completely uncontaminated, the MMPI having been unavailable to the diagnosing clinicians. A fourth (H) is effectively uncontaminated since, although MMPI results were available, the patient population sampled consisted of psychotics in a state hospital where the daily census of neurotic diagnoses runs only 1-2%. The remaining four samples (C, D, F, G) suffer from unknown but nonnegligible contamination. Hit-rates do not differ as between the contaminated and uncontaminated cases (χ² = 1.45, 2 df, p > .20). Tables 2 and 3 summarize the results of applying the rules to these eight samples.

³ We wish to express our indebtedness to the following clinical psychologists who were of great assistance to us through their kind efforts in making their data available to us for reanalysis or for taking the pains to track down diagnoses, original records, and other information on the criterion and cross-validation samples: H. R. Albrecht, VA Hospital, Chillicothe; George Guthrie, Pennsylvania State College; Howard F. Hunt, University of Chicago; Thomas Kiresuk, Minneapolis General Hospital; Timothy F. Leary, Kaiser Foundation, Oakland, California; James C. Lingoes, Langley Porter Clinic; Donald R. Peterson, University of Illinois; Albert Rosen, University of Maryland; Harold Rubin, VA Mental Hygiene Clinic, Philadelphia; Ranald M. Wolfe, Chillicothe VA Hospital.

⁴ A four-page summary of the sources and characteristics of the eight cross-validation samples has been deposited with the American Documentation Institute (See Fn. 2).

The H:M:I distribution does not vary significantly over the seven neurotic categories (omitting the single phobic reaction, χ² = 20.84, 14 df, .10 < p < .20). Pooling all schizophrenics, H:M:I does not vary over psychotic diagnoses (χ² = 8.08, p > .80). Although inspection suggests that the affective disorders as a group are harder to identify than the schizophrenics, this trend does not quite achieve statistical significance (χ² = 5.50, .05 < p < .10). Further checking on larger numbers of involutionals and manic-depressives is indicated, especially since among the decidable cases, hit-rates (H/(H + M)) do differ as between schizophrenics and affectives (.75 versus .54, χ² = 4.97, 1 df, .02 < p < .05). That is, when we ignore indeterminate curves, the affective cases are more often misidentified as neurotic. Qualitative study of the missed affective cases suggests that criterion error or patient change accounts for some of these misses.

The variation of H:M:I over schizophrenic subcategories fails to reach statistical significance (χ² = 14.67, 10 df, .10 < p < .20). It is interesting to note that the best results, in terms of high total hits, low indeterminate rate, and high confidence among classifiables, occur in the conversion and somatization groups. As we move into neurotic categories where dysphoria, failure of the defense, intellectualizing and other obsessional mechanisms, conscious guilt, and “bad” self-concepts are more in evidence, the profiles increasingly resemble those of the psychotic group.

The H:M:I distribution does not differ significantly as between all psychotics and all neurotics (χ² = 2.39, 2 df, p > .30). Considering only the 691 determinate-curve cases, neurotics and psychotics do not differ in hit-rate (χ² = 2.35, 1 df, .10 < p < .20).

The variation in H:M:I over the eight samples is significant (χ² = 43.14, 14 df, p < .001). One sample, D (Rubin’s Chillicothe data), fails to exceed chance differentiation among classified cases. All samples do show H > M, the ratios varying from 1.6 to 15.6. The hit-rates as between neurotic and psychotic cases do not differ in seven of the samples, but the 19% higher identification of neurotics in Sample G is at the .001 level.
Whether one is heartened or discouraged 
by these results will-depend both upon his 
clinical expectations and his methodological 
orientation. If we concentrate upon the fact 
that the total number of correct classifications 
in the entire sample is only 53.3%, 


> 30). Con- 


we inay 
not be much impressed. If, however, we accept the fact that some curves are ambiguous and ought not to be given any appreciable weight in making a decision (an attitude toward laboratory tests to which our medical colleagues have become thoroughly accustomed), and therefore attend chiefly to the subset of cases for which the rules provide a decision, we see that the ratio of hits to misses among decided curves is 3.2 to 1. When a curve is classified on the basis of the rules, the classification has attached to it a confidence of .76. Assuming the MMPI to be in use in a clinic for whatever (multiple) purposes it is deemed appropriate, the minute or less of clerical time required to apply the Meehl-Dahlstrom rules is not an unjustifiable expenditure of effort to obtain this much additional information on a sizable subset of cases. That the concurrent validity compares favorably with the pooled weighted judgments of 29 Minnesota clinicians (and is better than any of them taken individually) would seem to justify substitution of the rules for impressionistic profile assessment with respect to the psychoticism variable. These patterns also discriminate better than any of five other “actuarial” methods, including the linear discriminant function (Meehl, 1959a).

[TABLE (number not legible): COMPARATIVE RESULTS [...] DIAGNOSIS. The numerical columns did not survive extraction; only the row labels and a closing total of 100.0 remain. Legible column headings: Diagnosis, N.
Rows: Psychoneurosis or psychosomatic, unspecified; Anxiety; Conversion or somatization; Depression; Mixed; Other (e.g., passive-aggressive, neurotic character); Obsessive-compulsive; Hypochondriasis; Phobic reaction; Schizophrenia, paranoid; Schizophrenia, mi[...] unspecified; Schizophrenia, latent [...] remission; Schizophrenia, hebephrenic; Schizophrenia, simple; Schizophrenia, catatonic; Psychosis, unspecified; Manic-depressive, depressed; Manic-depressive, manic; Manic-depressive, mixed; Involutional psychosis; Paranoid condition; Psychosis with psychopathic personality.]

[TABLE 3: COMPARATIVE RESULTS IN THE EIGHT CROSS-VALIDATION SAMPLES. Table body not recoverable; legible heading fragments: Criterion, % psychotic.]
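The pooled figures quoted above hang together arithmetically, and a short check makes the relationship explicit. The sketch below is illustrative only: it takes N = 988 from the Summary, the 691 determinate-curve cases and the 53.3% overall correct figure from the text, and it assumes that the number of hits is the rounded product of the last two (a count the text does not state directly). The variable names are ours.

```python
# Figures taken from the text.
N_TOTAL = 988              # all cross-validation cases (Summary)
N_DETERMINATE = 691        # cases for which the rules give a decision
PCT_CORRECT_OVERALL = 53.3 # correct classifications as % of all cases

# Assumption: the hit count is the rounded share of the total sample.
hits = round(N_TOTAL * PCT_CORRECT_OVERALL / 100)
misses = N_DETERMINATE - hits
indeterminate = N_TOTAL - N_DETERMINATE

hit_miss_ratio = hits / misses                 # hits per miss among decided curves
confidence = hits / N_DETERMINATE              # hit-rate among decided curves
indeterminate_share = indeterminate / N_TOTAL  # share of undecided curves

print(f"H:M:I = {hits}:{misses}:{indeterminate}")
print(f"hit:miss ratio = {hit_miss_ratio:.1f} to 1")
print(f"confidence when classified = {confidence:.2f}")
print(f"indeterminate share = {indeterminate_share:.0%}")
```

Under these assumptions the ratio, the confidence value, and the share of indeterminate curves agree with the 3.2 to 1, .76, and roughly 30% given in the text.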


One of the present samples (Palo Alto, Sample B) does provide sufficient uncontaminated information to make feasible certain within-diagnosis comparisons which shed a little light upon construct validity. Through the kind cooperation of Howard Hunt, we were able to obtain access to a one-page psychological summary sheet for each of the patients in that sample. This summary, in addition to the usual face-sheet data, also included varying amounts of information regarding diagnostic considerations (e.g., impression of the admission board, number of previous admissions, type of service discharge, compensation, hospital status at the time of testing, Rorschach or Wechsler-Bellevue findings). One of us (PEM) read over each of these summary sheets (uncontaminated by knowledge of the associated MMPI profile) and made a subjective judgment in six steps as to the strength or clarity of evidence for and against psychotic tendencies. To get a rating of ++ for “very clear psychosis,” the data had to include the admission board impression and a Rorschach diagnosis, neither of which considered any alternative. Semiobjective rules were set up for lesser degrees of clarity or amount of information running through +, = (“neurotic but debatable”), −, and −− (“very clear neurosis, considerable data given”). Any case marked “in remission” was automatically considered questionable. Cases with disharmony between admission board impression and final diagnosis were automatically considered questionable. For the 44 cases judged as either “very clear” or “clear,” the hit-rate was 68% (67% and 69% for the two levels of clarity, respectively); whereas for the 13 cases considered “doubtful,” only four were correctly classified, corresponding to a hit-rate of 31% (χ² = 5.84, p < .02). That is, when the admission board was impressed with some behavior not reflected in the formal diagnosis, or the psychiatric staff was moved to record a secondary diagnosis, or the psychologist giving the Rorschach found evidence running counter to the diagnosis administratively assigned, such a patient was more likely to produce an MMPI profile out of harmony with the official diagnosis than was true for patients in whom no such inconsistencies were in evidence.
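The χ² quoted for the clear versus doubtful cases can be reproduced from the counts in the text. The sketch below is a check under stated assumptions, not the authors' computation: it assumes the 68% hit-rate among the 44 clear cases corresponds to 30 hits (a count inferred here, not given in the text), takes the 4 hits among 13 doubtful cases as stated, and uses the uncorrected Pearson formula for a 2 × 2 table. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Clear cases: 44 in all; 30 hits is an assumption inferred from the 68% rate.
clear_hits, clear_misses = 30, 14
# Doubtful cases: 13 in all, 4 correctly classified (stated in the text).
doubtful_hits, doubtful_misses = 4, 9

chi2 = chi_square_2x2(clear_hits, clear_misses, doubtful_hits, doubtful_misses)

print(f"clear hit-rate    = {clear_hits / 44:.0%}")
print(f"doubtful hit-rate = {doubtful_hits / 13:.0%}")
print(f"chi-square (1 df) = {chi2:.2f}")
```

With these counts the statistic comes out at 5.84, matching the reported value, which suggests that no continuity correction was applied.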


The relation between construct and concur- 
rent validity in situations of this type is too
complex and involves too much methodologi-
cal controversy to be developed here (see 
Meehl, 1959b). From the standpoint of con- 
current validity, the Palo Alto findings just 
described can be interpreted either favorably 
or unfavorably, depending upon one’s prag- 
matic emphasis. From a traditional viewpoint, 
the subjective rating based on the face-sheet 
data may be viewed as a “criterion” (al- 
though highly unreliable); the diagnostic re- 
versals given by the pattern rules in this in- 
termediate range must then be conceived as 
validly reflecting the “mixed” character of the 
patient’s behavior, certain aspects of which 
unduly influence the psychiatric staff respon- 
sible for the formal diagnosis while other as- 
pects are conveyed by the face-sheet informa- 
tion and therefore reflected in the “clarity” 
ratings of the case reader. From another point 
of view, one can argue clinically that the prac- 
tical utility of a psychometric instrument 
varies inversely as the clarity with which the 
patient’s diagnosis can be pegged without the 
instrument, so that the finding of high agree- 
ment for clear-cut cases and poor agreement 
for borderline cases is precisely the reverse of 
what might be desired in clinical practice. The 
rationale of this second position involves an 
implicit commitment to the idea of construct 
validity and brings us to our own preferred 
mode of thinking about such internal rela- 
tionships. The construct validity of the pro- 
file is its power to reveal the internal psycho- 
logical structure and state of the patient,
which it does fallibly and probabilistically, as
do the other indicators available (including
the social impact of the patient upon the 
diagnosing psychiatrist). To say that one 
wants the test precisely for cases where the 
diagnosis is otherwise difficult, is to say that 
“mixed” behavior output leaves one in doubt 
as to the inner psychological condition, and 
it is desired to use the test as an aid in as- 
sessing this inferred internal state of affairs. 
This way of conceptualizing the situation 
makes it unfeasible to refine the validation 
procedure further unless the quantitative and 
qualitative character of the additional data 
(other than formal diagnosis) is superior to 
that routinely available. 

Since the kind and degree of construct va- 
lidity remain to be established and the pres-
ent investigation deals only with the most
obvious among the indefinite family of con- 
current and predictive validities which the 
patterns may possess, a vexing problem of 
terminology obtrudes itself. How shall the 
decidable profiles be labeled? In the present 
state of the evidence, we would favor a rather 
noncommittal language. The adjectives “psy- 
chotic” and “neurotic” might be carefully de- 
fined as referring merely to curve types, warn- 
ing against an automatic classification of the 
patient, but one realizes that such hygienic 
semantic provisions do not always achieve 
their aim in practical usage. Furthermore, the 
term “psychotic” has itself a rich network of 
associations, valuable for theoretical purposes 
and in the design of subsequent research but 
which in daily clinical practice should not be 
linked too closely to the curve form itself. 
We propose as a terminological convention that the neurotic curve type as identified by these rules be called an “N-curve,” and the psychotic curve type be referred to as a “P-curve.” The construct, concurrent, and predictive validities associated with these types and the subtypes (rules) by which the broader groups are identified then remain to be determined. This approach is consistent, at the profile level, with our preference for the use of digits in designating individual MMPI keys. An alternative and theoretically neutral locution would be “first zone” (=N) and “third zone” (=P), reserving “second zone” for the conduct disorder types, adopting a “zone” language as is done, e.g., in the colloidal gold test of clinical neurology.

We hope that a “bootstraps effect” has been achieved by the identification of these configurations. From a construct validity viewpoint, the phenomenon of a “test miss” can
be very instructive and in the long run may be more productive of understanding and improvement in assessment procedures than an obvious “hit.” Suppose that a patient presents, on formal mental status examination and other behavior data gathered from informants and ward personnel, a nonpsychotic clinical picture. Yet, as happens with considerable frequency, the MMPI pattern is a schizophrenic P-type. Assume exclusion of obviously invalid or essentially “chance” response patterns readily detectable by present methods. What are we to think about this patient? Setting aside questions of evaluating the test, one can contemplate this situation in terms of the patient’s psychology. The patient, when presented in a standard manner with 550 verbal stimuli, has responded to them systematically, in a way which is “statistically congruent” with the response patterns of patients who were not clinically clear of psychotic behavior but rather identified by nontest criteria as diagnosable schizophrenics. This systematic pattern of responding cannot be dismissed. In any adequate account of the psychometric situation, it will have to be fitted into some kind of consistent causal analysis. When out-and-out faking or lying has been excluded, the verbal responses “true” and “false” to this standardized set of verbal stimuli must be psychologically construed as phenomenological reports of varying adequacy. It is of course not assumed that the reports, insofar as they have a content referring to behavioral dispositions or the facts of the world or of other people’s conduct, are “correct” (Meehl, 1945). What is, however, assumed once we have excluded clear fake records is that the patient is reporting, within the limitations of a prespecified domain and fixed-response context, the character of his phenomenology at the time of testing. We then have a paradox. Here is a patient whose MMPI-phenomenology is “statistically characteristic” of clinically obvious schizophrenia but whose behavior in a diagnostic interview, on the ward, and in recent extramural contexts is not clinically schizophrenic. Such a patient must be psychologically different from both the clinically schizophrenic patient with a third zone curve and the neurotic patient with a first zone curve. His verbal response to these items is not something detached from the world and utterly uninterpretable without nontest data. After all, the test has elicited a sizeable mass of verbal behavior, and if previous evidence enables us to say “what is a schizophrenic way of responding to these items,” the fact that a patient has responded thus is a weighty piece of evidence regarding him and must be “explained” rather than, as is so often done in staff conferences, “explained away.”

There are dangers involved in this kind of thinking for those who would irresponsibly defend an instrument to which they are strongly committed, effectively cutting themselves off from all possibility of refutation; but this danger should not mislead us into undervaluing the evidential weight of a psychometric pattern. An adequate understanding of the class of patients defined as test-misses from the concurrent validity standpoint must include a psychological account of the presence of their schizophrenic phenomenology, when their social and other behavior is apparently that of the neurotic group. The development of such an account would seem to hinge upon the accumulation of further data on as many different kinds of concurrent, predictive, and content validity as the ingenuity of clinical investigators can devise. Examples which come readily to mind would be the following: In a group of patients all officially diagnosed as psychoneurotic, what is the relationship between the incidence of schizophrenic Rorschach indicators and the presence of P-curves of schizoid type? If we assume that some so-called “neurotic depressive reactions” are actually phases of a “damped” endogenous manic-depressive cycle, are there nontest indicators (e.g., a history of recurring depressions in the patient, or the finding of depressions and suicides in his family history) associated with “neurotic” depressions manifesting P-profiles of the depressive type? What is the relationship between curve type and such physiological indicators as the Funkenstein reaction (Funkenstein, Greenblatt, & Solomon, 1952) or the sedation threshold (Shagass & Jones, 1958)? If we classify patients on the basis of curve type, and then for each curve group, regardless of the clinical diagnosis officially given at the time of initial study, plot cumulative subsequent hospitalization over a long time span, how do the slope and asymptote constants of these curves compare? Suppose that a group of patients seen in an outpatient setting are Q sorted and Q correlations with an idealized Q sort description of “pseudo-neurotic schizophrenia” are computed. Considering a sample of patients all diagnosed as neurotic, how do the distributions of these Q correlations compare as between neurotic patients having a P- versus those having an N-profile? It is by accumulation of investigations of this type, preferably in the context of at least a sketch of theory as to the nature of the psychotic disorders, that an adequate picture of the construct validity of profile pattern signs will have slowly to be achieved.


SUMMARY 


By a combination of statistical searching 
and codification of clinical experience, a set 
of objective profile signs has been evolved as 
an aid in the discrimination between “neu- 
rotic” and “psychotic” MMPI profile pat- 
terns. These signs were then applied to the 
profiles of eight cross-validation samples (N
= 988) from a diversity of clinical installa-
tions involving varying degrees of contami- 
nation from essentially none for four samples 
to unspecifiably high. Approximately 30% of 
the cases present profiles which are classified 
as indeterminate with regard to the psychotic- 
neurotic distinction. Among classifiable cases, 
the concurrent validity hit-rate varied from 
a low of 61% to a high of 93% in the eight 
samples, with a median of 73% hits, and a 
hit:miss ratio of 3.2:1 for the total pooled 
sample. It is suggested that research with 
these patterns be directed to the elaboration 
of their construct validity and to the psycho- 
logical understanding of the phenomenology 
and dynamics of cases which are “test-misses”
from the concurrent validity standpoint. It is 
also suggested that curves characteristic of 
those found in diagnosed psychoneurotics be 
designated simply as N-type (or “first zone”)
profiles, in contradistinction to P-type (“third 
zone”) profiles manifesting the configuration 
typical of psychotic patients. The rules, while 
complex and of a somewhat forbidding as- 
pect, can, after a little practice, be applied by 
a clerk in less than a minute’s time.

Attempts to determine the effectiveness of
psychotherapy have used many different cri- 
teria to assess the extent of change and there 
has been considerable discussion of what vari- 
ables constitute the most adequate measures. 
Some researchers have solved the problem of 
having to decide between a number of avail- 
able criteria, none of which is completely 
satisfactory, by combining several measures 
into a composite criterion. Assumptions im- 
plicit in this approach are that psychotherapy 
brings about a general and unitary person- 
ality change and that all criteria are imper- 
fect measures of this change. Other research- 
ers have insisted that the effects of psycho- 
therapy may be very specific and that the 
criterion for evaluation of therapeutic out- 
come should bear a direct theoretical relation- 
ship to the underlying theory of therapy. It 
is important to know the degree of generality 
of therapeutic outcomes, both in making de- 
cisions concerning appropriate criteria for re- 
search, and in determining whether specific 
therapies may be needed for specific prob- 
lems or whether general psychotherapeutic 
techniques may be applicable to a wide va- 
riety of problems. 

Two previous factorial studies of therapy 
change (Cartwright & Roth, 1957; Gibson, 
Snyder, & Ray, 1955) suggest that person- 
ality change is multidimensional and that the 
factors tend to be organized around the vari- 
ous instruments used to measure change. It 
is, however, erroneous to interpret this as 
indicating that the effects of psychotherapy 
are themselves multidimensional. The therapy 
is certainly not the only independent variable 
operating to produce change in a group of 
subjects (Ss) undergoing therapy. Thus, to study the dimensionality of therapy change, it is necessary not only to determine the factor structure of the change measures, but also to establish which, if any, of the change factors are related to the amount of therapy received.

1 This study was supported by a grant (XR 1916) from the Purdue Research Foundation.

The ideal design for dealing with this prob- 
lem would be to obtain a variety of measures 
of personality change from a large number of 
Ss undergoing psychotherapy and to obtain 
the same change measures from a matched 
group of control Ss over a comparable wait- 
ing period. A comparison of the factor struc- 
ture of the change measures in the therapy 
group with that in the control group would 
indicate whether or not therapy has a general 
or group effect which would tend to increase 
the correlations among groups of change 
measures in the therapy group. Then comparisons of the amount of change on the various factors in therapy and control groups would indicate which change factors are affected by therapy and which are not. Unfortunately, this paper is not a report of the results of this ideal study, but rather is an attempt to answer some of the same questions with somewhat less adequate data.

In this study, a number of therapy change scores are factored in order to determine the dimensions along which change occurs in the measures used. Changes along these dimensions in the therapy group are compared with changes in a comparable control group in order to determine which of the changes can be attributed to the therapy. Additional evidence on this point is obtained from the relationship of the changes in the therapy group to the amount and intensity of the therapy.
METHOD 
Subjects 


During the 1957-58 and 1958-59 academic years, 75 undergraduate students were seen at the Purdue Psychological Clinic for at least five psychotherapeutic interviews and completed the California Psychological Inventory (CPI) and a sentence completion test before the first and after the last interview. At the end of therapy they also completed a brief rating scale, a portion of which is shown in Table 1, and the therapist filled out a somewhat longer rating scale which included four items corresponding to those shown in Table 1. There were 57 male and 18 female clients. Approximately half were self-referred to the clinic while the other half were referred from the dean’s office, the student health service, or other campus agencies. Difficulties presented in the intake interview ranged from alcoholism and homosexuality to lack of motivation to study. The most common complaints were difficulties in interpersonal relationships, difficulties in school work, and family or autonomy problems. [...] of the cases included in this study were so sick that they had to leave school during the course of therapy. The mean number of therapy interviews [...] standard deviation was 9.2.

These Ss were seen for therapy by third- and fourth-year graduate students who were enrolled in a two-semester practicum course. There were 41 therapists, most of whom saw two of the research Ss. Most of the therapists had had some previous experience with psychotherapy, but for a few this was their first direct therapy experience. [...] of the therapy was closely supervised [...] supervisory conference [...] each hour of therapy [...] through a one-w[...] [...] once weekly, but [...] two or [...] times a week. The [...] therapist [...] considerably [...] were in the direction of [...] interpersonal rel[...] discussion of feelings and [...] client’s gaining insight.

When a client came [...] an initial intake interview by a graduate assistant and an appointment was made to take the CPI and the sentence completion test. After the tests were completed the client [...] a therapist who with his supervisor [...] in complete charge of the case. The clinic closed for the summer at the end of the spring semester [...] caused the termination of therapy for 58 of the cases included in the present study. The [...] cases terminated voluntarily some time before the end of the spring semester. Following the last therapy interview the client was contacted by letter or phone by the experimenters (Es) and asked to come in to take the posttests. These were administered by the Es and the clients did not [...] their therapists at this time.

As might be expected it was often difficult to get all of the required information on every client. Of 157 clients who took the initial tests only 75 met the requirements of the study. Most of those not included were eliminated because of an insufficient number of interviews. However, a number were not included because they dropped out of school or for some other reason did not take all of the tests.


Factor Analysis of Change Scores 


For each of the 75 Ss, 30 change scores were available. The CPI was scored for the 18 standard scales described in the CPI manual (Gough, 1957) and change scores were calculated by subtracting the pretest score from the corresponding posttest score. The sentence completion test consists of 55 stems developed by Rosenberg (1957), each of which presents a conflict situation to be reconciled. The test was scored according to an unpublished system developed by the junior author in which the Ss’ reconciliation of the conflict presented by each stem was classified as either active or passive and as either firm (dealing directly with the conflict situation) or soft (avoiding the conflict situation). The combinations of these ratings summed over all items yielded four scores: active-soft (AS), active-firm (AF), passive-soft (PS), and passive-firm (PF). Interscorer reliability was found to be .87 for AS, .75 for AF, .64 for PS, and .49 for PF by Ebner (1958) using a shortened form of the test and another college student sample. Four sentence completion change scores were calculated by subtracting the [...] the corresponding posttest score.

Four ratings of amount [...] the end of therapy by both [...] the items shown in Table [...] were intercorrelated and the matrix was factor analyzed by the principal component method with communalities estimated as the highest correlation in each row. On the basis of the size of the largest factor loadings and the size of the latent roots it was decided to retain six factors which were [...] orthog[...] varimax criterion.
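Two of the computational steps described above are simple enough to sketch: forming a change score as posttest minus pretest, and placing a communality estimate on the diagonal of the correlation matrix before factoring. The code below is a minimal illustration with made-up numbers, not the authors' program. It assumes that “highest correlation in each row” means the largest absolute off-diagonal value, which is the usual convention but is not spelled out in the text; the function names and data are hypothetical, and the factoring and varimax steps themselves are not shown.

```python
def change_scores(pretest, posttest):
    """Change score for each scale: posttest minus pretest."""
    return [after - before for before, after in zip(pretest, posttest)]

def reduced_correlation_matrix(r):
    """Copy of r whose diagonal holds each row's largest absolute
    off-diagonal correlation, used as the communality estimate."""
    reduced = [row[:] for row in r]
    for i, row in enumerate(r):
        reduced[i][i] = max(abs(v) for j, v in enumerate(row) if j != i)
    return reduced

# Hypothetical scores for one S on three scales.
pre = [20, 31, 25]
post = [24, 30, 29]
print(change_scores(pre, post))

# Hypothetical correlations among three change scores.
r = [
    [1.0, 0.6, -0.2],
    [0.6, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
for row in reduced_correlation_matrix(r):
    print(row)
```

The reduced matrix, rather than the matrix with unities on the diagonal, would then be passed to the factoring routine, so that only the variance each measure shares with the others is analyzed.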


THE FACTORS

The findings of this analysis are consistent with those of Cartwright and Roth (1957) and Gibson, Snyder, and Ray (1955) that the amount of change occurring with psychotherapy depends to a large extent on the vantage point from which it is observed. Each instrument used to assess change defined its own factor or factors.

2 The correlation matrix and the unrotated [...] matrix have been deposited with the American Documentation Institute. Order Document No. [...] ADI Auxiliary Publications Project, Photoduplication Service, Library of Congress; Washington [...] D. C., remitting in advance $1.25 for microfilm [...] $1.25 for photocopies. Make checks payable to Chief, Photoduplication Service, Library of Congress.

3 These calculations were done on Purdue’s Datatron 205 computer.

TABLE 1

RATING SCALE FILLED OUT BY THE THERAPY GROUP [...] TREATMENT

We are trying to evaluate the effectiveness of the work of the Psychological Clinic in an attempt to improve the services offered. Your frank answers and comments to the items below will greatly help us in doing this.

On the following four items rate the amount of change, if any, which has occurred in you since you first came to the clinic. Consider all changes whether due to coming to the clinic or not. Indicate your rating by a check mark by the appropriate number.

1. Amount of change in the symptoms or complaints which brought you to th[...]
   Legible scale anchors: worse than when I came here; no change; some improvement; [...]; no longer bot[...] by sy[...]
2. Amount of change in your understanding of yourself and your [...]
   Legible scale anchors: more confused about myself now; no change; some better; [...]; understand myself [...]
3. Amount of change in feeling and general outlook on life.
   Legible scale anchors: feel worse; no change; feel some better; [...]
4. Overall rating of amount of change considering everything that [...]
   Legible scale anchors: change for the worse; no change; some change for the be[...]; [...]

Note. Ther[...] 4 was reversed on the [...] the scales in the above form.

Factor A is clearly a CPI factor with high loadings on Wb, Re, Sc, To, Gi, and Ac. Since the Wb and Gi scales were constructed to detect faking, this factor may be tentatively interpreted as representing change in the tendency to present oneself in the best light on inventory items. The other scales loading on this factor represent changes in self-description in regard to maturity, conscientiousness, permissiveness, and dependability. However, the profiles presented by Gough (1957) of Ss instructed to fake bad show a marked lowering of the scores on these scales. As a matter of fact, the differences between the T scores of Ss faking good and Ss faking bad reported by Gough (1957) are, with the exception of the Cm scale, proportional to the corresponding factor loadings on the present factor.

If the interpretation of this factor as a faking or social desirability factor is correct, it is surprising that the client’s direct ratings of change do not load on it. Perhaps this is because the CPI scores are difference scores between pretest and posttest and would thus be sensitive to any change in tendency to present oneself in a good light, while the client ratings are made only at the end of therapy when the client may not be aware of this change in self-description or may not consider it in his ratings.

Factor B is clearly a therapist rating factor. In addition to the therapist ratings, the client rating of change in understanding loads on this factor. What is surprising is that the therapist ratings identify a factor which is so little related to test score changes and client ratings. The loading of client rating of change in understanding may represent some real agreement between therapist and client. However, since any changes achieved were presumably discussed in the final interviews, one would reasonably expect more agreement than this.

Factor C is another CPI factor on which Do, Cs, Sy, Sp, and Sa have the major loadings. There are lower loadings for Wb, To, Ac, Ai, Ie, and PS. The highest loading CPI scales on this factor are grouped together by Gough (1957) as measures of poise, ascendancy, and self-assurance. They differ from those loading primarily on Factor A in that they all seem to be concerned with ease of social interaction and comfortableness with others, whereas the scales loading on Factor A are more concerned with responsibility and self-control.

[TABLE 2: ROTATED FACTOR MATRIX. The factor loadings did not survive extraction; only the row labels remain.
CPI change scores: Do Dominance; Cs Capacity for status; Sy Sociability; Sp Social presence; Sa Self-acceptance; Wb Sense of well-being; Re Responsibility; So Socialization; Sc Self-control; To Tolerance; Gi Good impression; Cm Communality; Ac Achievement via conformance; Ai Achievement via independence; Ie Intellectual efficiency; Py Psychological-mindedness; Fx Flexibility; Fe Femininity.
Sentence completion change scores: AS Active-soft; AF Active-firm; PS Passive-soft; PF Passive-firm.
Client ratings of change: Symptoms and complaints; Understanding of self; Feeling and outlook on life; Overall rating.
Therapist ratings of change: Symptoms and complaints; Understanding of self; Feeling and outlook on life; Overall rating.]

Factor D is a client rating factor and has generally negligible loadings for all other variables. There is a low loading for the therapist rating of change in feeling. Surprise was expressed when the therapist ratings seemed
largely independent of the other measures of 
change, but such surprise is even more appro- 
priate in regard to the client ratings because 
the test score changes (especially the CPI) 
represent a kind of client rating in themselves. 
As was mentioned in discussing Factor A, the 
test score changes represent differences in per- 
formance at two different times and thus are 
sensitive to changes that the client may not 
be aware of at the end of therapy, since his 
rating of change depends on memory of his 
earlier state. It is interesting to note that the 
client’s rating of change in understanding has 
only a moderate loading on this factor, sug- 
gesting that improved understanding is not so 
important in determining the client’s evalua- 
tion of the outcome of therapy. Going back to 
Factor B, the therapist rating factor, for a 
moment, it is interesting that the therapist’s 
rating of change in feeling has the lowest 
loading among the therapist ratings and that 
it does have a small loading on Factor D.
This suggests that part of the lack of rela- 
tionship between the therapist and client rat- 
ings may be due to a difference in values 
rather than complete lack of agreement on 
what changes have occurred. The therapists 
seem to consider self-understanding more im- 
portant in their concept of improvement than 
does the client and conversely the client seems 
to value change in the way he feels somewhat 
more highly than does the therapist. Change 
in presenting symptoms seems to be about 
equally important for therapist and client. 

Factor E is defined mainly by loadings of 
the sentence completion scores and seems to 
represent change in the ratio of active to pas- 
sive solutions to the conflict presented in the 
sentence stem. The negative loading for Fe 
and the positive loading for Gi are consistent with the tentative interpretation of those
sentence completion scores, but the negative 
loading of Re does not seem to fit into any 
reasonable interpretation. 

Factor F has Cm, Ie, Wb, Ac, and PF as 
high loading tests. Since Cm, the highest load- 
ing variable, is a rare response scale and Wb, 
the second highest loading variable, is a re- 
sponse distortion scale, perhaps this factor 
represents change in the care taken in responding
to the test items. This would be
consistent with the negative loading for PF 
since the PF sentence completions represent 
easy avoidance of a conflict situation and are 
fairly obviously not desirable responses. This 
interpretation is supported by the fact that 
on the profile presented by Gough (1957) of 
CPI items answered at random plotted on the 
male norms, the four scales deviating most 
from the mean are the Cm, Ie, Wb, and Ac
scales, and the rank order of their deviation
is the same as the rank order of their factor 
loadings on the present factor. 

Nature of the test score change factors

One question that is of interest is whether or
not the factors identified in the test score
changes are the same as those that would be
isolated from the original test scores. An
unpublished factor analysis of the 18 CPI scales
done independently on different Ss was
compared with the two CPI factors of the present
study. Factor A is practically identical to the
first factor of the CPI analysis and the
correlation between factor loadings based on the
18 CPI scales is [value illegible]. Factor C is very similar
to the second factor of the CPI analysis with
a correlation between factor loadings of .91.
Thus, there is clear-cut evidence that the two
CPI change factors represent changes in the
same influences that are responsible for
covariance among the CPI scales in the first
place.

Correlations between the two CPI change
factors (A and C) and the corresponding factors
in the pretests were estimated by averaging
the correlations between the pretest scores
and the corresponding change scores for the
scales with the highest loadings for the factor
using the z transformation. These correlations
were -.39 for Factor A and -.44 for
Factor C. This indicates that the lower (more
maladjusted) the original score the greater
the change tends to be. Since most of the Ss
in this study had initial scores below the mean
for college students these correlations could
be interpreted as indicating regression toward
the mean due to unreliability of the test
scores. Alternatively these negative correlations
could be interpreted as indicating the
greater tendency of the more maladjusted
cases to improve, either because of therapy
or spontaneously. The same regression effect
is present in the sentence completion scores
as is indicated by the estimated correlation
of -.46 between pretest score and change
score on Factor E.
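
The averaging of correlations "using the z transformation" mentioned above can be sketched as follows. This is a minimal illustration, assuming the standard Fisher procedure (convert each r to z, average the z values, convert back); the function name and the three example correlations are hypothetical and are not taken from the study.

```python
import math

def average_correlations(rs):
    """Average correlations via Fisher's z: r -> z, mean of z, z -> r."""
    zs = [math.atanh(r) for r in rs]   # Fisher z transformation
    mean_z = sum(zs) / len(zs)         # unweighted mean in z space
    return math.tanh(mean_z)           # back-transform to the r metric

# Hypothetical pretest/change correlations for the scales defining one factor
example_rs = [-0.30, -0.45, -0.40]
combined = average_correlations(example_rs)
print(round(combined, 3))
```

An unweighted mean is used because every scale is scored on the same Ss; with unequal Ns one would ordinarily weight each z by N - 3.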


CHANGES OCCURRING WITH THERAPY 


The above factor analysis has indicated the 
dimensions along which change occurs. How- 
ever, it does not tell us whether or not the 
mean changes in the Ss receiving therapy are 
different from zero nor does it tell us whether 
or not these changes have been brought about 
or facilitated by the therapy. 


Evaluation of Mean Change

The mean change in the salient variables
for each of the factors is shown in Table 3.
Every high loading variable for Factors A,
B, C, and D shows significant change in a
direction that would be considered to reflect
improvement. The patterns of change for
Factors E and F are not so clear-cut. Three
of the four variables loading on Factor E
show significant change but one of these, Re,
is in the wrong direction. Only half of the
high loading variables on Factor F show
significant change, and this may be due to the
loadings of these scales on other factors.
Thus, Factors E and F cannot be said to
show consistent changes.

These results show that the 75 Ss, all of
whom received therapy, when considered as
a group show definite improvement on the
two CPI factors and on the therapist and the
client rating factors.



TABLE 3

SIGNIFICANCE TESTS OF CHANGE SCORES OF THE
SALIENT VARIABLES ON EACH FACTOR

[Table body not recoverable from the source. Legible fragments: column headings "Variable" and "Mean"; row labels Factor A, Wb, Re, Sx, Symptoms, Understanding, Feeling, Overall; values 14.70** and 13.95**.]


Changes Related to the Therapy

Although the above results show significant
improvement on four of six change factors,
they do not indicate whether or not the
improvement was produced by the therapy.
There are, however, some comparisons within
the present data that bear on this important
question. One possible indication of the effect
of therapy on the various change factors
would be the correlation of the factors
with the number of interviews on the assumption
that the larger the number of interviews,
the greater should be the therapeutic effect.
These correlations can only be suggestive,
however, since the number of interviews is
confounded with time between tests for the
test score changes and probably is confounded
with the placebo effect for the ratings. Also
there is probably a nonlinear relationship
between number of interviews and success as
has been suggested by Cartwright (1955).


Nevertheless, there was considerable variance 
in number of interviews (mean = 14.7, SD =
9.2) and correlations between each factor and 
number of interviews were estimated by av- 
eraging the correlations of the highest loading 
variables with number of interviews using the 
z transformation. These combined correlations 
are shown in Table 4. With correlations based 
on 75 cases the correlation with Factor B 
(therapist rating) is the only one significant. 
Another indication of the amount of ther- 
apy received might be based on the skill of 
the therapist. There were 41 therapists, each 
of whom was carefully supervised. Correla- 
tions between supervisors’ rating of thera- 
peutic ability of the therapist and the vari- 
ous change factors were estimated by the 
procedure used above. These correlations are 
shown in Table 4. Rated skill of the thera- 
pist is significantly related to both the thera- 
pist and client rating of change. Some con- 
tamination of the supervisors’ rating is a pos- 
sibility since the supervisor’s judgment of the 
therapist may have been influenced by the 
progress of the case, but this artifact is at- 
tenuated by the fact that each of the thera- 
pists carried several cases. Even though all 
the cases were not included in this study, 
they were all considered by the supervisor in 
making his rating. Number of interviews was 
correlated a borderline significant .21 with su- 
pervisors’ rating of skill of the therapist. 
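
The significance statements above (a correlation of .29 significant at the .01 level and a "borderline" .21, both with 75 cases and a one-tailed test) can be checked with a short sketch. It assumes the normal approximation based on Fisher's z, which may not be the exact test the authors used; the function name is hypothetical.

```python
import math

def one_tailed_p(r, n):
    """One-tailed p for H0: rho = 0, using the Fisher z normal approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)       # z statistic with SE = 1/sqrt(n - 3)
    return 0.5 * math.erfc(z / math.sqrt(2))   # upper-tail area of the standard normal

# Correlations reported in the text, each based on 75 cases
p_factor_b = one_tailed_p(0.29, 75)   # Factor B with number of interviews
p_skill = one_tailed_p(0.21, 75)      # number of interviews with rated skill
print(p_factor_b < 0.01, p_skill < 0.05)
```

Under this approximation the .29 correlation falls below the .01 level, while .21 falls between the .05 and .01 levels, which agrees with the wording in the text.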


TABLE 4

CORRELATIONS OF THE CHANGE FACTOR SCORES WITH
NUMBER OF INTERVIEWS AND SKILL OF THERAPIST

Change        Correlation with        Correlation with
Score         Number of Interviews    Skill of Therapist

Factor A      .01                     .09
Factor B      .29**                   [illegible]
Factor C      [illegible]             .07
Factor D      [illegible]             [illegible]
Factor E      [illegible]             .16
Factor F      [illegible]             .01

** Significant at .01 level with one-tailed test.


Robert C. Nichols and Karl W. Beck


COMPARISON WITH A CONTROL GROUP


On the basis of the above approximations
it appears that the therapist and client ratings
are the only measures of change that are
influenced by the psychotherapy. However,
comparisons within a therapy group leave
much to be desired as a means of evaluating
therapy, and data for a nontherapy control
group were collected to serve as a baseline for
the evaluation of changes in Factors A, C,
and D (three of the four factors showing
significant changes in the therapy group).


Considerations in Selecting Control Subjects 


One of the most difficult problems in the 
evaluation of the results of psychotherapy has 
been the securing of adequate untreated con- 
trol cases. Many investigators have omitted 
controls entirely. Some, such as Barron and 
Leary (1955), have used waiting list cases, 
and others, such as Rogers and Dymond 
(1954), have used a normal group of non- 
clinic Ss. 

Although these two studies represent the 
most adequate controls available for changes 
in self-ratings with therapy, some objections 
can be raised to both. Barron and Leary have 
pointed out that there may be considerable 
therapeutic effect in an initial intake inter- 
view and in being on a clinic list. They also 
suggest that self-rating tests may be affected 
by the relationship to the clinic and thera- 
pist, and these relationships are quite differ- 
ent for clinic and waiting list controls at the 
time of the posttest. The nonclinic control 
group used by Rogers and Dymond has the 
disadvantage of not being comparable to the 
therapy group in terms of maladjustment. On 
both of their self-rating measures, the self-
ideal correlation and the Q sort adjustment
score, the nonclinic control group achieved 
higher mean scores on the pretest than any 
mean score ever achieved by the therapy 
group. Thus, if the tests have any ceiling ef- 
fect, the control group will be more affected 
than the therapy group.

In the present study an attempt was made 
to secure a nonclinic control group that was 
comparable to the therapy group in a num- 
ber of important respects. The control group 
used in this study had the following charac- 
teristics: (a) each therapy case was matched 
with a corresponding control case for sex of 
S and for initial scores on the criterion test 
(the CPI), (b) time between pretest and
posttest for each control case was equivalent
to the time taken by therapy in the corresponding
therapy case, (c) both therapy and
control groups were drawn from the same
student population which is fairly homogeneous
with regard to age, (d) the control Ss
were told at the time of the posttest that the
effects of a course in which they were enrolled,
introductory psychology, were being
studied. This was an attempt to produce a
similar test-taking attitude to that of the
therapy Ss who knew that therapy was being
evaluated when they took the posttest.


Procedure for Obtaining Control Data


Data for the therapy cases were collected over a
two-year period. During the second year of the
study a control group was obtained for the 42 therapy
cases obtained during the first year. No control
cases for the second-year therapy Ss were collected,
so the N for all control group comparisons is 42.

Early in the fall of the second year of the study
183 introductory psychology students signed up
to participate in the experiment, which was a
course requirement, and were given the CPI. The
tests were administered in small groups. The CPI
profiles for these 183 Ss were plotted on transparent
paper and matched visually with the profiles of the
therapy Ss. The control S of the same sex and with
the most similar CPI profile was selected as the
matched control case for each of the 42 therapy Ss.
In those instances where several control Ss had similar
profiles to a given therapy S, the one with the
smallest D score (Cronbach & Gleser, 1953) was
selected. This procedure yielded reasonably good
matching in every case.

Since the therapy and control groups were to be
compared on the two CPI change factors it
would have been desirable to match the
groups on the basis of factor scores rather than
profiles. However, the control group had to be
collected before the factor analysis was done so this
was not possible. Reasonably good matching on factor
scores was obtained by the procedure used. The
mean pretest factor scores for Factors A and C (the
two CPI factors) of the therapy and control groups
are shown in Table 5. Although fairly good matching
of therapy and control pairs was obtained, this table
shows a nonsignificant tendency for the control group
to have higher scores.


TABLE 5

COMPARISON OF INITIAL CPI FACTOR SCORES
OF THERAPY AND CONTROL GROUPS

[Table body not recoverable from the source. Legible column headings: Therapy (Mean, SD), Control (Mean, SD). Legible value fragments, row and column unknown: "7.2 9.7" and "49.8 9.1".]


The 42 matched control Ss were called in
individually for retesting after a period of time equal to
that between pretest and posttest for the corresponding
therapy S. At this time the control Ss were told
that the Es were evaluating the effectiveness of the
introductory psychology course in bringing about
personality changes. They were retested with the
CPI and were given a slightly altered form of the
rating scale shown in Table 1. References to the
clinic were changed to make them appropriate to
the introductory psychology course. Throughout the
testing of the control group an attempt was made
to structure the testing situation in a manner as
similar to that of the therapy group as possible.

Factor scores for the two CPI factors were obtained
by calculating the mean standard score change
obtained from the CPI profile form for the highest
loading variables on each factor. Thus, the factor
score for Factor A is the mean standard score change
for Scales Wb, Re, Sc, To, Gi, and Ac; and for
Factor C is the mean standard score change for
Scales Do, Cs, Sy, Sp, and Sa. The factor score for
Factor D, client rating, is the mean rating for Scales
1, 3, and 4 of the rating scale shown in Table 1.


Comparison of Therapy and Control Groups

The change scores for therapy and control
groups on the three factors to be compared
are shown in Table 6. This table shows that
both therapy and control groups show significant
changes on all three factors, all in the
direction of improvement. When therapy and
control groups are compared the therapy
group shows significantly greater improvement
than the control group on Factors C and D.


TABLE 6

COMPARISON OF THERAPY AND CONTROL GROUPS ON CHANGE FACTORS
A, C, AND D

[Table body not recoverable from the source. Legible column headings: Therapy; Control; SD; Difference between therapy and control (t test), Means. Legible values, row and column unknown: 4.39**, 19.35**, 4.79, 5.66, 0.53, 0.56, 6.18**.]


TABLE 7

COMPARISON OF THERAPY AND CONTROL CHANGE SCORES ON
INDIVIDUAL CLIENT RATINGS AND CPI SCALES

[Numerical entries not recoverable from the source. Legible column headings: Therapy (N = 42); Test of difference between therapy and control (Variances, Means). Row labels:]

Factor A
  Wb Sense of well-being
  Re Responsibility
  Sc Self-control
  To Tolerance
  Gi Good impression
  Ac Achievement via conformance
Factor C
  Do Dominance
  Cs Capacity for status
  Sy Sociability
  Sp Social presence
  Sa Self-acceptance
Factor D (client ratings)
  Symptoms
  Feeling
  Overall

Not Scored on Any Factor
CPI Scales
  So Socialization
  Cm Communality
  Ai Achievement via independence
  Ie Intellectual efficiency
  Py Psychological mindedness
  Fx Flexibility
  Fe Femininity
Client Rating
  Understanding of Self



Other studies (such as Cartwright, 1956)
have found therapy groups to have greater
variance on change measures than control
groups. This is also true of the present therapy
and control groups on all three factors,
but these differences are not statistically
significant unless one is willing to use a
one-tailed test for this comparison.

A comparison of therapy and control groups
on the individual scales is shown in Table 7.
Here it can be seen that the significant
differences between therapy and control groups
on Factors C and D are due to consistent
differences in all of the scales loading on these
factors. This supports the contention that it
is on the underlying dimension common to
all scales loading on the factor that change
occurs. This consistency is not found in the
change scores loading on Factor A. The control
group shows greater change than the
experimental group on most of the scales loading
on Factor A, but shows a significant
change in the wrong direction on one scale,
To. The inconsistent performance of the To
change scores in the control group is puzzling,
but a recheck of the data indicates that it is
not due to recording or calculating error. The
change scores with no high loadings on any
factor (for the most part scales with low
communalities in the factor analysis) show no
significant differences between therapy and
control groups. This suggests that the dimensions
in the measures used in this study that are
affected by psychotherapy are pretty well
accounted for by the factors isolated.

Some Implications for Research Design


It may be appropriate to mention at this
point the great clarification of the therapy
and control comparison that was brought
about by the factor analysis. The CPI data
shown in Table 7 were available before the
factor analysis was done, and without the
organizing influence of the factors, the data
showed the confusing result of the therapy
group improving more on Cs and To and the
control group improving more on Sc and Gi.
Since the number of comparisons made was
relatively large, it was difficult to evaluate
the statistical reliability of this pattern of
differences. However, when the scales are
grouped according to their factor loadings as
in Table 7, an orderly pattern emerges.*

The control group changes shown in Table 7
indicate the necessity of using a control group
in the evaluation of psychotherapy change. It
cannot be said on the basis of the present
data whether the control group changes are
due to a general optimism leading to improved
self-ratings or whether they represent
regression effects due to the selection of the
more maladjusted cases by the matching process.
However, the marked tendency of the
therapy cases with the lower test scores to
show the greatest improvement, which was
discussed above, suggests that it is necessary
for an adequate control group to be matched
with the therapy group on the criterion measures
to be used.

* Harrison Gough (personal communication) has
pointed out that the grouping of CPI scales brought
about by the factor analysis was already implicit in
the four clusters of scales described in the manual.
The first cluster is very similar to our Factor C and
the second is very similar to our Factor A.
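
The regression effect invoked above can be illustrated with a small simulation. It is a sketch under simple assumptions that are not drawn from the study: each S has a stable true score, pretest and posttest add independent measurement error, and no real change occurs. The sample size, means, and standard deviations are arbitrary.

```python
import random

def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(n=2000, error_sd=5.0, seed=0):
    """Correlation of pretest with change when true scores do not change."""
    rng = random.Random(seed)
    true = [rng.gauss(50, 10) for _ in range(n)]
    pre = [t + rng.gauss(0, error_sd) for t in true]    # unreliable pretest
    post = [t + rng.gauss(0, error_sd) for t in true]   # unreliable posttest
    change = [b - a for a, b in zip(pre, post)]
    return pearson(pre, change)

r_pre_change = simulate()
print(round(r_pre_change, 2))
```

Because the pretest error enters the change score with a negative sign, the correlation between pretest and change is negative even with no true change, which is why a control group matched on initial criterion scores is needed to separate this artifact from a therapy effect.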


DISCUSSION 


The above results appear to show that
psychotherapy is effective in producing personality
changes that are reflected in changes on
certain CPI scales and in behavior discernible
by both therapist and client. However, the
factor analysis has shown that these changes
occur along three independent dimensions.
Most investigators seem to assume that
psychotherapy has a unitary effect in improving
the adjustment of the client, and that lack of
agreement among criteria is due to error in
measurement of some sort. The present findings
can be interpreted in a way that is consistent
with this assumption of unidimensional
change. Therapist and client ratings and CPI
measures of sociability may all be sensitive to
a unitary change in adjustment brought about
by therapy, but each of these types of measures
may in addition be affected by special
biases or other errors which are responsible
for the finding of independent factors. If this
were the case, one would expect to find some
correlation between the change scores on the
three factors due to the common influence of
the personality change in a group receiving
therapy. An inspection of the correlation
matrix of change scores reveals that client
ratings had a low positive relationship with
both the test score changes loading on Factor C
and the therapist ratings. But the therapist
ratings and test score changes were not
correlated. However, the client ratings show
similar correlations with the test score changes
loading on Factor A and with the other scores
that do not change with therapy. This suggests
that the relationship between client ratings
and the other variables is due to common
biasing effects rather than the influence
of a general psychotherapy change. The
alternative interpretation of the present results
is that the effects of psychotherapy are
multidimensional. In view of the lack of correlation
among change scores shown to be due to
therapy, this interpretation seems somewhat
more consistent with the data.

If the effects of psychotherapy are multi- 
dimensional, the psychotherapy itself must 
have several independent aspects which have 
differential effects on the various criterion 
measures. For example, if therapist ratings 
and test score changes both reflect improve- 
ment due to therapy, and yet they are un- 
correlated in a group receiving therapy, they 
must be due to different aspects of the therapy. 

The CPI change (Factor C) differentiates 
therapy and control groups, but does not ap- 
pear to be related to either amount or in- 
tensity of the therapy as represented by num- 
ber of interviews and skill of therapist. This 
suggests that the mere fact of having entered 
into a therapeutic relationship is sufficient to 
produce this change. The first association one 
is likely to have to this is that perhaps the 
CPI change is a manifestation of the tend- 
ency to present oneself as in need of help at 
the beginning and as not needing help at the 
end of therapy (the old hello-goodbye effect). 
There are, however, two findings that suggest 
that the CPI changes represented by Factor 
C are not due to increased tendency to pre- 
sent oneself in the best light at the end of 
therapy. The first is that therapy and control 
groups did not differ on Factor A, which is the 
factor that should be most sensitive to changes 
in tendency to present oneself in a good light. 
The second is the Es’ nonquantitative ob- 
servation that the therapy Ss did not feel any 
particular obligation to the clinic. The ther- 
apy group was much more difficult to sched- 
ule for posttests than the control group and 
many more of the therapy Ss missed posttest 
appointments and expressed resentment at the 
imposition than was the case with the gen- 
erally cooperative control Ss. 

Thus, the CPI change with therapy seems
on the one hand to be an attitude change
brought about by simply being in therapy,
while on the other hand, it does not appear
to be simply a change in tendency to present
oneself in the best light. Perhaps the CPI
differences between therapy and control groups
are due to differences in the changes expected
by the Ss to accrue from therapy and
introductory psychology. Or, alternately, perhaps
they reflect genuine personality changes
brought about by therapy. The latter
interpretation would necessitate the assumption
that the supervisors’ ratings do not correspond
to the therapists’ ability to produce this
particular change.

Therapist and client ratings were both
related to the skill of the therapist as rated by
the supervisor, and the difference between the
correlation of these two variables with number
of interviews was not significant. They
were also significantly correlated with each
other. (Correlations ranged from .23 to .36
on corresponding scales.) Thus, the assumption
that they are reflecting independent
changes is not as well justified as in the case
of the CPI changes. The lack of higher
correlation between them is probably due to
different biases inherent in the different points
of view. For example, data not presented in
this study indicate that the therapists tended
to rate as improved those with high (well
adjusted) initial CPI scores, suggesting that
their rating of change may be influenced by
the general level of adjustment, while this was
not observed in the client ratings.

One possible interpretation of the relationship
of client and therapist ratings with skill
of therapist is that therapist, client, and
supervisor may all be reacting to the ability of
the therapist to form pleasant lasting
relationships. However, if this were the case the
clients would be expected to show evidence of
presenting themselves in a good light in making
the favorable ratings that produced the
very significant difference from the control
group on Factor D. Evidence discussed above
suggests that this was not the case, thus it
appears more likely that both therapist and
client are rating real personality changes in
the client.


SUMMARY 


A number of measures of change with
psychotherapy including several therapist and
client ratings and change scores for 18 CPI
scales were factor analyzed, using 75 therapy
cases as subjects. The six factors obtained
from this analysis were largely identified by
the measuring instruments used. Of the six
factors four showed mean changes representing
significant improvement. The effects of
therapy on the various change factor scores
were studied by comparison with a matched
control group and by correlating the change
factor scores with number of interviews and
with rated skill of therapist. These comparisons
indicated that psychotherapy has significant
effects as measured by therapist and
client ratings of change and by change in a
group of CPI scales reflecting poise and
interpersonal effectiveness.

A CROSS-VALIDATION OF THE HOUSE-TREE-PERSON
DRAWING INDICES PREDICTING HOSPITAL 
DISCHARGE OF TUBERCULOSIS PATIENTS 


ALDO SANTORUM 


Veterans Administration Center, Martinsburg, West Virginia 


In an earlier article, Vernier, Whiting, and 
Meltzer (1955) reported certain signs ob- 
tained from the house and person drawings of 
the House-Tree-Person (H-T-P) Test which 
differentiate significantly the MHB (maximum
hospital benefit) discharged patient from the 
AMA (against medical advice) discharged pa- 
tient upon his admission to a Tuberculosis 
Service. They found significant differences be- 
tween MHB and AMA groups on 7 of 14 test 
signs selected for analysis (see Table 1, Orig- 
inal Data). It was concluded by the authors 
that: 


Despite the small samples available for analysis, a 
sufficient number of reliable differences in test scores 
were found to permit accurate prediction for individ- 
ual patients. The data for one of the tests, the draw- 
ing of the house, justified the development of a spe- 
cific index. 


PROBLEM 


Because of the theoretical implications as- 
cribed by the authors to the predictive signs, 
and because of their possible use in planning 
therapeutic programs for TB patients, the 
need for cross-validating the original results 
was deemed essential. The present study, then, 
represents an attempt to provide cross-valida- 
tion information on comparable groups of TB 
patients. In fact, our second cross-validation 
sample represents, in part, the same patient 
population from which their original sample 
was drawn. 


METHOD AND RESULTS 


H-T-P Test protocols were obtained from a
total of 397 patients admitted to our Tuberculosis
Service during the period of December
1954 through June 1956. For purposes of
analysis 397 “house” drawings and 383 “person”
drawings were available. One hundred
thirty-six of these patients received eventual
AMA discharges, while 261 patients were
discharged MHB. Because the Ns of both groups
were greater than 100, the formula
recommended by McNemar (1949, p. 76) was
utilized to test the significance of the differences
between groups on each of the 14 test
signs.

As noted in Table 1 (First Cross-Validation
Sample) none of the differences between the
groups reported significant in the earlier study
was found in this current analysis. Indeed, a
reversal in one of the signs, significant at the
.05 level, was noted in the current sample.
The findings suggested the need for further
corroborative efforts.

Consequently, the H-T-P protocols of a
second sample of TB patients admitted during
the period January 1953 to January 1954
were studied. All available patients receiving
either AMA (N = 74) or MHB (N = 77)
discharges were used. With Ns below 100 in both
groups, McNemar’s (1949, pp. 76-77)
recommended formula for “small sample” differences
between proportions and the rule-of-thumb
criterion which indicates when it is safe to
compare groups by the D/σ_D technique were
used. Again, as noted in Table 1 (Second
Cross-Validation Sample) no significant
differences between the AMA and MHB test
protocols were found.

As a further step, the statistical findings of
the original study were re-evaluated. The
formula for small sample differences and the
rule-of-thumb criterion again were employed.
Significant differences between the two groups
were found on six of the seven test signs
originally reported as significant. However, as
noted in the Reanalysis of Original Data
section of Table 1, four of these differences
did not meet the rule-of-thumb criterion,
suggesting their very limited predictive value.
The two remaining signs, which were reported
as significantly different, did fulfill the
rule-of-thumb criterion. However, as noted earlier,
these significant differences were not
confirmed in our two cross-validation samples.


TABLE 1

COMPARISON OF AMA AND MHB GROUPS ON THE HOUSE-TREE-PERSON TEST

[Table body not recoverable from the source. Column groups: Original Data (AMA, N = 50; MHB, N = 30); Reanalysis of Original Data; First Cross-Validation Sample; Second Cross-Validation Sample. Legible row-label fragments: "Door on ...", "Door details, 2 or ...", "Window detail ...", "Smoke from chi...", "Walk present", "steps drawn", "... side of house", "... side of pag...". Legible values, row and column unknown: 0.6, 3.39**.]


DISCUSSION 


It is difficult to account for the large
discrepancies between results of the original
study and the results obtained in the
cross-validations. Investigation of the characteristics
of the original and current patient samples
revealed minimal differences between the
groups with regard to sex, age, educational
level, and personality traits. In addition, at
least 50% of the second cross-validation sample
included patients used in the original
study. Since the scoring method is rather
clear-cut, the scoring of the data does not appear
to be an area which contributed significantly
to the obtained large discrepancies.

Had the authors of the original study ap- 
plied McNemar’s rule-of-thumb criterion a 
more cautious interpretation of the results of 
the original study would have been offered. 


SUMMARY 


The H-T-P protocols of two samples of 
patients who had obtained either MHB
discharges or AMA discharges were analyzed in
an effort to cross-validate the findings of an 
earlier study (Vernier, et al., 1955). None of 
the seven signs of the original study which 
were reported to differ significantly between 
the groups was found to hold up in the cross- 
validation. 


The seven signs reported in the original 
study as significantly different were rean- 
alyzed by appropriate statistical techniques; 
five of these signs were found to be statis- 
tically untenable. 

The results of the present study suggest 
that the implications reported in the original 
study for projective test theory and for the 
problem of AMA and MHB behavior prediction
should be utilized with extreme caution. 
CHANGES IN RORSCHACH PERFORMANCE AND
CLINICAL IMPROVEMENT IN SCHIZOPHRENIA 1


ROSALINE GOLDMAN 2


Veterans Administration Regional Office, Baltimore, Maryland


The concept of withdrawal as descriptive
of the behavior of the schizophrenic patient
(Bleuler, 1916; Coon, 1944; Henderson &
Gillespie, 1950; Masserman, 1946; Strecker,
Ebaugh, & Ewalt, 1951) has been overshadowed
by the controversies and differences
in opinion about the consequences, etiology,
and mechanisms of schizophrenia. The complex
behavior known as withdrawal involves
the total organism and manifests itself in a
variety of ways, both in psychologic and
physiologic behavior (Angyal, Freeman, &
Hoskins, 1940). With clinical improvement
following a schizophrenic episode, patients
show greater interaction with others, more
responsiveness to the environment, and greater
expression of emotion (Greenblatt & Solomon,
1953). These changes are evidence of a return
to “normal” behavior. We can use these
changes in behavior to identify three areas
in which withdrawal, so important in understanding
schizophrenia, occurs. These can be
grouped as the areas of (a) interpersonal
relations, (b) attitudes towards or perception
of the world, and (c) the emotional life. Each
area may be related to the special interests of
a different psychiatric school. The Sullivan
school concerns itself particularly with withdrawal
from people. Meyer (Cameron, 1944)
and Campbell (1935) emphasize withdrawal
from the world (environment). The Freudian
school emphasizes withdrawal of emotions
from external objects as libidinal energy is
centered on or in the self.


1 This study was carried out at the Boston
Psychopathic Hospital while the writer was a USPHS
Fellow. The complete report of the study is on file
at Boston University Library, as the writer’s
unpublished dissertation of the same title (1955).

2 The writer is indebted to Chester C. Bennett for
his invaluable critical assistance.



In this study the assumption was made that the Rorschach test has
satisfactory indices to reflect reactions in the areas of
interpersonal relations, perception of the world and ability to
deal with emotions, and that these indices are sensitive to changes
within individuals. The Rorschach indices reflecting responses in
the three areas were derived from Rorschach theory and were not
empirically determined single scores. The indices were essentially
those combinations of scores and ratios weighed and evaluated
according to their theoretical significance by the clinician when
he interprets a Rorschach protocol.

The Rorschach scores reflecting the empathy, rapport, and feeling
toward others and which were called indices of interpersonal
relations are (a) the Movement and Human responses (M + H) and
(b) the ratio of form-color responses to the sum of color-form and
pure color responses (FC:CF + C). The patient’s perception of the
world was reflected in four sets of Rorschach indices: (a) the
perception of stimuli as others see them (F+%), (b) conformity
with the thinking of the group (the number of Popular responses),
(c) evidence that the world is a source of fear and danger
(responses with destructive and tension-laden content), and
(d) distortion in the distance maintained by the individual between
himself and the world (Rejections, Denials, Self-References). The
expression and control of emotion were reflected in four sets of
Rorschach indices: (a) the ratio of intellectually determined
responses to the total number of responses (F%), (b) the affective
energy available for response to external stimuli (8-9-10%),
(c) emotional maturity and control (FC:CF:C), and (d) the ratio of
the sum of the emotional responses which are controlled to those
responses where control is secondary or lacking
(FC + FY + FV:CF + C + YF + Y + VF + V).
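The indices listed above can be illustrated with a short sketch. It assumes the conventional definitions of the percentages (F% as pure form responses over all responses, F+% as good-form responses over pure form responses, 8-9-10% as responses to the last three cards over all responses); the article names the indices but does not spell out these formulas. The `Protocol` record, its field names, and the example tallies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Protocol:
    """Hypothetical tally of scores for one Rorschach protocol."""
    responses: int          # R, total number of responses
    last_three_cards: int   # responses given to Cards VIII, IX, and X
    M: int = 0              # human movement responses
    H: int = 0              # human content responses
    F: int = 0              # pure form responses
    F_plus: int = 0         # pure form responses of good form quality
    FC: int = 0
    CF: int = 0
    C: int = 0
    FY: int = 0
    YF: int = 0
    Y: int = 0
    FV: int = 0
    VF: int = 0
    V: int = 0


def percent(part: int, whole: int) -> float:
    """Percentage of part in whole; 0.0 when the denominator is empty."""
    return 100.0 * part / whole if whole else 0.0


def indices(p: Protocol) -> dict:
    """Compute the ten index families named in the text (assumed formulas)."""
    return {
        "M+H": p.M + p.H,
        "FC:CF+C": (p.FC, p.CF + p.C),
        "F+%": percent(p.F_plus, p.F),
        "F%": percent(p.F, p.responses),
        "8-9-10%": percent(p.last_three_cards, p.responses),
        "FC:CF:C": (p.FC, p.CF, p.C),
        # Form-dominated emotional responses vs. those where form is
        # secondary or lacking.
        "controlled:uncontrolled": (
            p.FC + p.FY + p.FV,
            p.CF + p.C + p.YF + p.Y + p.VF + p.V,
        ),
    }


example = Protocol(responses=20, last_three_cards=6, M=2, H=3, F=10,
                   F_plus=8, FC=2, CF=1, C=1, FY=1, YF=1, FV=1)
print(indices(example))
```

Ratios are kept as pairs rather than divided, since a clinician reads both terms and a zero right-hand term is meaningful rather than an error.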


PROCEDURE


The general hypothesis tested was that patients who show clinical
improvement have significantly greater incidences of changes in the
direction of healthier Rorschach responses than patients who do not
show clinical improvement. Changes in the direction of healthier
responses as these are judged in Rorschach practice were taken as
evidence of diminution of withdrawal.

The Rorschach protocol of each patient which was obtained when he
was hospitalized as acutely ill was compared with his protocol
obtained just preceding his departure from the hospital, regardless
of condition at time of discharge. Predictions for the changes in
each set of Rorschach indices accompanying changes in clinical
condition were:

1. In the area of interpersonal relations, clinically improved
patients would show changes in the direction of a greater number of
healthy movement and human responses (M + H) and a relative
increase in the ratio of FC responses to the sum of CF + C
responses.

2. In the area dealing with perception of the world, the clinically
improved patient would show changes in the F+% in the direction of
the optimum range; the number of Popular responses would increase;
Content would contain fewer signs of tension; and there would be
fewer Rejections, Denials, and Self-References.

3. In the area dealing with expression and control of emotion,
patients showing improvement would show change in the direction of
the optimum range in the F% and in the 8-9-10%; the FC:CF:C
distribution would be away from the “regressive” shift; and the
form-dominated responses (FC + FY + FV) would increase relative to
the responses where form is secondary or lacking
(CF + C + YF + Y + VF + V).

Only cases diagnosed as having schizophrenic reaction, regardless
of type, of [...] than six months’ [...] and who were between the
ages of 16 and 40 were included in this study. This approximates
the range Henderson and Gillespie (1950) give as the [...] within
which two-thirds of the schizophrenias have their onset. Patients
with chronic alcoholism, epilepsy, mental defect, physical
handicaps from birth, psychosurgery, and those who were already
undergoing somatic therapy [...]. The [...] age for the total [...]
years; education was 12.06 years of schooling. Only one patient in
this study [...] of age. The present hospitalization was his
second, with a remission period of more than five years. Two other
patients also had been hospitalized previously; in each instance an
interval of more than two years had elapsed between
hospitalizations, during which time the patients had made economic
and social adjustment.
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TABLE 1 


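VALUES FOR SIGNIFICANCE OF CHANGES IN RORSCHACH INDICES OF WITHDRAWAL
IN CLINICALLY IMPROVED VS. UNIMPROVED CASES

[Table body not recoverable. Columns: Rorschach Index; All Improved
Cases (N = 33); Cases Showing Marked Improvement (N = 14); Cases
Showing Some Improvement (N = 19). Surviving row labels: Attitude
toward People (M + H; FC:CF + C; Combined Indices); Popular;
Rejections; Self-References.]
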
Rorschach changes in the total group of improved cases (33
patients) were compared with the incidences of these changes in the
unimproved group (12 cases). In the same way, the incidences of
positive changes in the 14 cases showing marked improvement, and
the 19 cases showing some improvement were compared with the
incidences of these changes in the 12 cases showing no improvement
(Table 1, Columns 2 and 3). A p value less than .05 was the level
set in this study for rejection of a null hypothesis. Since only
direction of change (positive, in the direction of “healthier”
responses) was studied, the p values for the one-tailed test
(Jones, 1952) were used.
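A comparison of incidences of this kind can be sketched as a one-tailed test on a 2 x 2 table. The article does not say which statistic was computed, so the use of Fisher's exact test here is an assumption, and the counts in the usage line are hypothetical rather than taken from Table 1.

```python
from math import comb


def fisher_one_tailed(pos_improved: int, n_improved: int,
                      pos_unimproved: int, n_unimproved: int) -> float:
    """One-tailed exact probability for a 2 x 2 table with fixed margins.

    Returns the probability of observing at least `pos_improved` positive
    changes in the improved group if positive changes were unrelated to
    clinical improvement.
    """
    total_pos = pos_improved + pos_unimproved
    n = n_improved + n_unimproved
    upper = min(total_pos, n_improved)
    denom = comb(n, total_pos)
    tail = sum(
        comb(n_improved, k) * comb(n_unimproved, total_pos - k)
        for k in range(pos_improved, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 25 of 33 improved cases and 4 of 12 unimproved
# cases show a positive change on some index.
p = fisher_one_tailed(25, 33, 4, 12)
print(f"one-tailed p = {p:.4f}; reject at .05: {p < 0.05}")
```

Only the upper tail is summed because the hypothesis concerns change in one direction, which is the rationale the text gives for one-tailed p values.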


RESULTS 


The significance of the incidences of positive changes in the
groups showing improvement when compared with the unimproved group
for each index and the combined indices is given in Table 1. The
greater discriminatory power of the combined Rorschach indices
corroborates Rorschach theory and practice which hold that
individual scores must be combined and patterned out for meaningful
interpretation. Incidences of positive changes in the entire group
of 33 improved cases were significantly greater for 7 of the 10
individual Rorschach indices when compared with the incidences of
positive changes in the 12 unimproved cases. The cases showing
marked improvement showed significant changes on 8 of the 10
indices. In contrast to this, the cases showing only “some”
improvement showed significant changes on only 2 of the 10 indices.
The greater clarity and higher instances of positive changes in the
markedly improved group suggest that “pure” cases give findings
more significant than less definitive cases even though the number
of cases in a sample is considerably reduced by using only the pure
cases.


DISCUSSION 


The Rorschach changes which most signifi- 
cantly differentiated between the improved 
and unimproved cases were seen in the re- 
versals of the regressive shift in the FC:CF:C 
“ladder.” The increase in the FC responses 
suggests that with clinical improvement the 
ego functions more effectively and there is a 
greater control over emotions. Use of the
qualitative changes in each patient’s record,
such as changes from minus to plus signs in 
a given score, may have contributed to the 
high significance consistently shown for the 
association between the reversal of the re- 
gressive shift and clinical improvement. 

Cases evaluated as markedly improved clini- 
cally showed increased control of both emo- 
tional responsiveness and emotional tensions 
(increases in FC + FY + FV). The combined 
improved cases showed only a trend toward 
this change. The inference can be made that 
the more disturbing emotions indicated by 
shading and vista responses are especially 
difficult to deal with, since only the cases 
showing marked improvement can handle
them effectively. 

The decrease in pathological content which 
was found to be significant in all improved 
groups suggests that the world becomes less 
hostile or barren to the clinically improved 
patient regardless of the degree of improve- 
ment he shows. The significant positive 
changes in the F + %, indicating greater 
ability to deal with reality, may be dynami- 
cally interlocked with the significant change 
found in decreased pathological Content. To- 
gether, these changes suggest that the world 
is “tested” more realistically by the patient 
because he is less afraid, and that he may be 
less afraid because of greater intellectual con- 
trol and ego strength. 

Two studies dealing with retest changes 
report negative findings when using Popular 
responses. Kelley, Margulies, and Barrera 
(1941) found fluctuations in the number of 
Popular responses given by amnesic patients 
when almost all other factors on retest were 
stable. Siegal (1948) found that Popular re- 
sponses “occurred diffusely” in patients show- 
ing improvement as well as in those failing to 
show improvement. Molish (1951) found that 
three-fourths of the failures to give Popular 
responses by schizophrenic patients occurred 
on cards which usually elicit human percepts. 
In light of the relationship found in the pres- 
ent study between the M + H index and clini- 
cal change, his findings suggest that we might 
have seen meaningful changes in Popular re- 
sponses if the analysis had been made in 
terms of their content rather than in terms 
of their total number in a protocol. 




Although prognostic signs were not the concern of this study,
relevant data are available from the analysis of the initial test
records of the improved and unimproved cases. These groups did not
differ significantly, upon admission to the hospital, in the
instances of healthy and unhealthy responses on any of the
Rorschach indices used in this study. F+% was slightly more
disturbed in the patients who showed marked improvement later, but
this index showed no trend toward discriminating between those
cases who later improved and those who did not improve when the
degree of improvement was undifferentiated.


SUMMARY 

To study changes in withdrawal which occur with clinical
improvement, Rorschach indices of responsiveness to people, the
environment, and to the patient’s own affective life were selected.
The indices reflecting activity in these three areas were derived
from Rorschach theory and were not empirically determined scores.
The indices were essentially those combinations of scores and
ratios weighed and evaluated according to their theoretical
significance by the clinician when he interprets a Rorschach
protocol.

The Rorschach records of 45 cooperative acutely ill schizophrenic
patients between the ages of 16 and 40 were studied. These cases
were consecutive hospital admissions selected only on the basis of
diagnosis, age, and recency of onset of illness. Each patient in
this study was tested both when he was acutely ill and when he left
the hospital as either improved or unimproved. Each individual was
both an experimental subject and his own control; changes between
first and second protocols were analyzed.

The findings in this study show that the Rorschach technique brings
the functioning personality into focus so that the assumption that
this instrument is sensitive to changes within an individual is
justified. Combined indices were more significantly related to
clinical improvement than the separate indices, except for the
FC:CF:C ratio. This is in accord with Rorschach theory and practice
which hold that individual scores must be combined for meaningful
interpretation of a protocol.
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INTELLECTUAL FUNCTIONING IN TEMPORAL 
LOBE EPILEPSY 


OSCAR A. PARSONS 


Duke University and Veterans Admini 


Recent reports of impaired verbal abilities 
in psychomotor or temporal lobe epileptics 
have given new impetus to the continuing 
search for patterns of cognitive functioning 
associated with abnormal brain conditions. 
Quadfasel and Pruyser (1955) found that 
psychomotor epileptics had significantly lower 
Wechsler-Bellevue Verbal IQs in comparison 
to Performance IQs than did a group of 
grand mal epileptics. They further reported 
that memory for verbal material as measured 
by the Wechsler Memory Scale was signifi- 
cantly impaired for the psychomotor group 
while memory for designs was intact. Meyer 
and Jones (1957) confirmed the findings on 
the Wechsler-Bellevue in English patients and 
demonstrated that the deficit became greater 
subsequent to temporal lobe operation. Clini- 
cal evidence has mounted over many years to 
indicate that speech and communicative skills 
are associated with the temporal lobes, especially
that of the left or dominant hemisphere.
Since most psychomotor epileptics have left 
temporal lobe foci (Gibbs & Gibbs, 1952), 
the clinical and experimental findings of ver- 

1 This report is based in part on a thesis submitted by the junior
author in partial fulfillment of the degree of Bachelor of Arts
with distinction at Duke University, under the direction of the
senior author. The authors are grateful to Joseph B. Parker, Chief,
Psychiatric Service, Durham VA Hospital; and Bernard Stotsky and
Paul Daston, Chiefs, Psychological Service, Durham VA Hospital, for
their aid in carrying out this study. This research was supported
in part by a grant (B-1459) to the senior author from the National
Institute of Neurological Diseases and Blindness, USPHS; and was
carried out in conjunction with a larger VA study, The Social and
Emotional Adjustment of Epileptics, under the direction of Joseph
B. Parker, William Wilson, and Lever Stewart.

2 Parsons is now at the Department of Psychiatry, Neurology, and
Behavioral Sciences, University of Oklahoma Medical Center.
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The above results not only bear on the 
question of specificity and localization of brain functions but
also seem to hold promise for the clinician who is faced with many
perplexing diagnostic problems in connection with psychomotor
epilepsy, e.g., differentiation from anxiety states, schizophrenia,
psychopathic behavior, etc. However, before warrant may be given
for generalization, the possible restrictions and conditions should
be explored. One such restriction has already been raised by Meyer
and Jones (1957), who found that only left temporal lobe epileptics
manifested the verbal deficits; right temporal lobe subjects (Ss)
did not. On the other hand, Quadfasel and Pruyser report the same
Verbal-Performance discrepancy in both right and left temporal lobe
epileptics. Examination of
previous studies suggested that variables such 
unilaterality bilaterality of 
“pure” psychomotor vs. psychomotor mixed 
with generalized seizure patterns have been 

purpose of the present study, 
ine those cognitive abilities 
Verbal and Performance 
Wechsler Adult Intelli- 
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then, 
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tern may exist for Ss with temporal lobe 
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sion, as measured by Picture Arrangement 
type tests; in Reitan’s (1957) proposition 
that the Similarities subtest of the Wechsler 
is particularly affected by temporal lobe dam- 
age and in Penfield’s stimulation studies 
(Penfield & Jasper, 1954) which have led 
him to postulate that the temporal lobes have 
an important function in memory, compari- 
son, and generalization. The specific pattern, 
then, may take the form of lower Similarities 
(generalization and comparison function), In- 
formation and Digit Span (remote and im- 
mediate memory), and Picture Arrangement 
(visual comprehension) for the psychomotor 
group in comparison to control (nonCNS 
damaged) Ss. 

Again, however, relevant variables must be considered. Degree of
personality disturbance, a variable often ignored in studies of the
brain damaged, seems pertinent. It is recognized that psychomotor
epileptics manifest more personality disorders than other epileptic
groups (Gibbs & Gibbs, 1952) and that personality disturbances may
affect cognitive functions (Wechsler, 1958). It would appear that a
nonepileptic, nonCNS damaged control group, which was equated for
degree of personality disturbance, would give the clearest test for
patterns on the Wechsler associated with temporal lobe epilepsy. A
generalized seizure group who had foci in the same area as the
psychomotor group would provide a test for the uniqueness of any
pattern found in the latter group. However, it might be more
reasonable to expect that groups with similar localization of
lesions (as defined by EEG foci) would manifest a similar pattern
of cognitive difficulties. In summary, the first two predictions to
be tested are:

1. Psychomotor and generalized seizure epi- 
leptics with EEG foci in the same general 
area (unilateral) have lower Verbal than 
Performance Scale scores on the WAIS. 

2. These epileptic patients also have a pattern of cognitive
functioning characterized by lower Information, Similarities, Digit
Span, and Picture Arrangement subtest scores, in comparison to a
group of nonepileptic controls equated for degree of personality
disturbance.

A third purpose of this research is the investigation of the effect
of degree of abnormal EEG activity on the WAIS performance
of the epileptic groups. Recent findings by 
Hovey and Kooi (1955) suggest that fluctua- 
tions in the level of responding (with conse- 
quent lowering of quality of response) on the 
Wechsler are related to the presence of sub- 
clinical paroxysmal EEG activity, the latter 
occurring in the absence of overt or clinical 
manifestations of seizures. It is anticipated 
that the relationships stated above will be 
more clearly established in those epileptics 
having a greater degree of abnormal EEG activity.


METHOD

Subjects 


The Ss in this study were 46 male right-handed patients of the
Durham VA Hospital. Patients were placed into three groups,
Psychomotor, Generalized, and Control, on the basis of criteria
stated below. These three groups were equated for age, race, and
education and were not significantly different on Full Scale WAIS
IQs (Table 1).

Fifteen Ss were placed in the Psychomotor group on the basis of
three criteria: (a) the presence in two or more
electroencephalograms of interictal epileptiform discharge
localized in only one of the temporal lobes ([...] Ss with right
temporal foci and [...] with left foci comprised this group);
(b) the presence of seizures which were clearly epileptic and only
psychomotor in nature, as determined by clinical observation and/or
histories elicited from the patient and relatives or acquaintances
familiar with the patient’s seizures;³ and (c) an age of onset of
epileptic symptoms of 20 years or older.

Sixteen Ss were selected for the Generalized group on the basis of
type of seizure (generalized, tonic-clonic seizures) and the
presence of focal epileptiform activity in two or more EEGs. Seven
of these patients had psychomotor seizures in addition to their
grand mal symptoms. The foci of this group were mixed, although
predominately temporal lobe (14 out of 16 patients): 5 Ss had left
temporal foci, 2 right temporal, 3 bitemporal, 2 left
frontotemporal, 1 left parietal-temporal, 1 right
temporal-occipital, 1 left frontocentral, and 1 right frontal pole.
Onset of seizure symptoms also was after 20 years of age. The
differences between the Psychomotor group and the Generalized group
on mean age of onset and on duration of disorder are not
significant (Table 1).

All epileptic patients were classified as to degree of abnormal EEG
activity. “Abnormal” was defined as the presence of any of the
following: spikes, sharp waves, sharp and slow wave complexes,
multiple spike bursts, and spike and wave discharges. Records were
rated on a three-point scale ranging from minimal activity
(occasional or rare occurrence) through moderate to marked
(abnormal activity occurring almost all the time) by a board
electroencephalographer. In all other respects the neurological
examination of these patients was negative.

3 The authors wish to thank William P. Wilson and Lever F. Stewart,
certified electroencephalographers, who furnished the diagnostic
information reported in this study. The EEG criteria were based on
Penfield and Jasper (1954).


TABLE 1

DESCRIPTION OF THE CONTROL, PSYCHOMOTOR, AND GENERALIZED GROUPS

[Table body not recoverable. Rows: mean and SD for Control
(N = 15), Psychomotor (N = 15), and Generalized (N = 16). Surviving
column headings: Education, Full Scale IQ, Duration of Disorder,
Age of Onset of Symptoms. Surviving values, cells unknown: 9.0,
3.4.]

Note. None of the differences among groups is significant.

The Control group consisted of 15 patients who had no evidence of
CNS pathology and who were referred to the Psychological Service
for psychodiagnostic testing. These Ss were equated with the
Psychomotor and Generalized groups on pertinent variables and for
the degree and pattern of personality disturbance as manifested on
the Minnesota Multiphasic Personality Inventory (MMPI). The MMPI T
scores of the three groups, Control, Psychomotor, and Generalized,
do not differ significantly (Table 2). The collective profile is
marked by an elevated neurotic triad and psychasthenia and
schizophrenia subscales.


Procedure 


The Ss in the two epileptic groups were tested by the junior author
on the WAIS, as part of a larger battery of tests. The WAIS was
administered according to standard procedure (Wechsler, 1955).
Since another block design test was in the larger battery, the
Block Design subtest was omitted; therefore, the Performance Scale
and Full Scale IQs reported for these Ss are prorated. The MMPI
(Group Form) was administered to these Ss as part of the routine
intake procedure of the Psychological Service.

Control group Ss were administered the WAIS and the MMPI by various
examiners from the Psychological Service [...].
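The proration mentioned in the procedure can be sketched as follows. The article states only that the Performance and Full Scale IQs were prorated after Block Design was omitted; the rule shown (scaling the sum of the administered subtests by the ratio of the full number of subtests to the number given) is the conventional Wechsler practice and is assumed here. The scaled scores in the example are hypothetical, and conversion of the prorated sum to an IQ requires the manual's tables, which are not reproduced.

```python
def prorate(scaled_scores, full_count: int) -> float:
    """Scale the sum of the administered subtests up to a full-length sum.

    scaled_scores: scaled scores of the subtests actually given.
    full_count:    number of subtests the complete scale contains
                   (5 for the WAIS Performance Scale, 11 for the Full Scale).
    """
    if not scaled_scores:
        raise ValueError("at least one subtest score is required")
    if len(scaled_scores) > full_count:
        raise ValueError("more scores supplied than the scale contains")
    return sum(scaled_scores) * full_count / len(scaled_scores)


# Hypothetical S: four Performance subtests given, Block Design omitted.
performance_given = [9, 10, 8, 11]
print(prorate(performance_given, full_count=5))
```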
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RESULTS 


The means and standard deviations for the Performance Scale IQs,
Verbal Scale IQs, and Full Scale IQs are presented in Table 3. A
Lindquist Type I analysis of variance of the Performance Scale and
Verbal Scale IQs reveals a significant trials effect. It is evident
from inspection of the means that this difference lies in the two
epileptic groups, both of whom have approximately the same
discrepancy, i.e., a lower Performance than Verbal IQ which is
opposite to the first prediction. Comparison of the left temporal
and right temporal lesion cases in both epileptic groups indicates
that the right temporal group (N = 10) have approximately the same
Performance and Verbal IQs while the left temporal group (N = 16)
manifests the discrepancy.

4 As members of the general class of brain injured, epileptic Ss
would be more likely to score lower on the Block Design than
nonepileptic groups and thus reduce the discrepancies between the
expected lower Verbal than Performance scores. Thus, omission of
the B-D should favor the predicted higher Performance Scale.


TABLE 2

MEAN MMPI T SCORES OF THE CONTROL, PSYCHOMOTOR, AND GENERALIZED GROUPS

[Table body not recoverable. Rows: Control, Psychomotor,
Generalized. Surviving scale headings: L, Hy, Pt. Surviving values,
cells unknown: 56, 53, 50, 73, 69.]

Note. None of the differences among groups is significant as
measured by U test.


TABLE 3

MEAN PERFORMANCE IQ, VERBAL IQ, AND MEAN “PERFORMANCE MINUS VERBAL”
SCORE FOR CONTROL, PSYCHOMOTOR, AND GENERALIZED GROUPS

[Table body not recoverable. Columns: Performance IQ, Verbal IQ,
Performance-Verbal Score. Rows: mean and SD for each group.
Surviving values, cells unknown: 276.8, 298.0, 204.0, 66.3, 16.9.]


TABLE 4

[Caption and table body not recoverable. Surviving footnotes:
Generalized group significantly different from Controls;
Generalized group different from Controls at p < .10.]


TABLE 5

MEAN AGE, EDUCATION, PERFORMANCE, AND VERBAL IQs FOR CONTROL,
MINIMAL, AND MODERATE-SEVERE ACTIVITY GROUPS

Group                           Age
Controls (N = 15)               31.5
Minimal (N = 14)                31.8
Moderate and Marked (N = 17)    36.1

[Education and Performance IQ columns not recoverable. Surviving
Verbal IQ values, rows unknown: 93.7, 91.3, 95.6.]


To test the second prediction, a difference score was obtained by
subtracting the mean of the scaled scores on Information,
Similarities, Digit Span, and Picture Arrangement from the mean of
the scaled scores for the remaining six subtests for each S. A
positive score for the epileptic groups means a difference in the
predicted direction. In the last column of Table 4 the difference
score is presented for each group. The Generalized group differs
significantly (t = 2.17, p < .05) from the Control group in the
predicted direction, i.e., having a lower mean Information,
Similarities, Digit Span, and Picture Arrangement score; the
Psychomotor group does not. Examining the pattern of subtest
scoring reveals that the weight of the difference between groups is
carried by Similarities (p < .01) and to a lesser extent by Picture
Arrangement (.10 > p > .05). Information and Digit Span are not
significantly different.
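The difference score used for the second prediction can be written out as a short sketch. The subtest names are those of the WAIS with Block Design omitted, as the procedure describes; the function name and the scaled scores in the example are hypothetical.

```python
# The four subtests predicted to be lower in the epileptic groups.
PREDICTED_LOW = ("Information", "Similarities", "Digit Span",
                 "Picture Arrangement")


def difference_score(scaled: dict) -> float:
    """Mean of the remaining subtests minus mean of the predicted-low four.

    A positive value is a difference in the predicted direction.
    """
    low = [v for k, v in scaled.items() if k in PREDICTED_LOW]
    rest = [v for k, v in scaled.items() if k not in PREDICTED_LOW]
    if len(low) != len(PREDICTED_LOW) or not rest:
        raise ValueError("scores for all four predicted-low subtests "
                         "and at least one other subtest are required")
    return sum(rest) / len(rest) - sum(low) / len(low)


# Hypothetical S with ten subtests (Block Design omitted).
s = {
    "Information": 8, "Comprehension": 10, "Arithmetic": 10,
    "Similarities": 7, "Digit Span": 8, "Vocabulary": 11,
    "Digit Symbol": 9, "Picture Completion": 10,
    "Picture Arrangement": 9, "Object Assembly": 10,
}
print(difference_score(s))
```

Group comparison then proceeds on these per-S scores, as in the t test reported in the text.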

The effect of degree of abnormal EEG activity was tested by
comparing the fourteen Ss, from both the Psychomotor and
Generalized groups, rated as having minimal activity, to the 17 Ss
falling in the moderate and marked categories. In Table 5 the means
for the control variables and Performance and Verbal IQs are noted
for the three groups. It is evident that the Minimal group tends to
be lower in education and the Moderate-Marked group is somewhat
older. The latter variable, however, is not likely to produce an
effect since both Verbal and Performance IQs are adjusted for age
(Wechsler, 1958). In light of a near significant F
(.10 > p > .05) for differences between the two abnormal activity
groups on education and the finding of a correlation (r) of +.59
between education and Performance IQ, an analysis of covariance was
performed (Edwards, 1950). The result of this analysis for the
Performance IQ indicates a highly significant difference between
groups, the Marked-Moderate group having a much lower Performance
IQ than would be expected for their educational level in comparison
to the Minimal group. Analysis of covariance for the Verbal IQ, on
the other hand, revealed no differences.


TABLE 6

ANALYSIS OF COVARIANCE OF PERFORMANCE AND VERBAL IQs IN ABNORMAL
EEG ACTIVITY GROUPS

[Table body not recoverable. Rows under both Performance IQ and
Verbal IQ: Within, Adjusted Mean. Surviving values, cells unknown:
54.11, 1381.00. Surviving note: p < .001.]
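The logic of the analysis of covariance can be sketched for the simplest case of one covariate (education) and one dependent variable (IQ). The article cites Edwards (1950) without giving its computations, so the textbook one-way formula below is an assumption about the form of the analysis, and the data are hypothetical.

```python
def ancova_f(groups):
    """One-way analysis of covariance with a single covariate.

    groups: list of (x_values, y_values) pairs, one pair per group, where
            x is the covariate and y the dependent variable.
    Returns (F, df_between, df_within) for the adjusted group difference.
    """
    def sums(xs, ys, mx, my):
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        syy = sum((y - my) ** 2 for y in ys)
        return sxx, sxy, syy

    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    n, k = len(all_x), len(groups)

    # Sums of squares and products about the grand means.
    txx, txy, tyy = sums(all_x, all_y, sum(all_x) / n, sum(all_y) / n)

    # Pooled sums of squares and products about each group's own means.
    wxx = wxy = wyy = 0.0
    for xs, ys in groups:
        a, b, c = sums(xs, ys, sum(xs) / len(xs), sum(ys) / len(ys))
        wxx, wxy, wyy = wxx + a, wxy + b, wyy + c

    # Remove the part of y predictable from the covariate.
    adj_total = tyy - txy ** 2 / txx
    adj_within = wyy - wxy ** 2 / wxx
    df_between, df_within = k - 1, n - k - 1
    f = ((adj_total - adj_within) / df_between) / (adj_within / df_within)
    return f, df_between, df_within


# Hypothetical data: years of education (x) and Performance IQ (y).
minimal = ([8, 10, 12, 14], [100, 105, 107, 114])
marked = ([8, 10, 12, 14], [90, 95, 97, 104])
print(ancova_f([minimal, marked]))
```

The F ratio is referred to a table with the returned degrees of freedom; one degree of freedom is lost from the within term for the fitted regression slope.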


ence between groups, 


DISCUSSION 


The finding that the epileptic groups have
lower Performance than Verbal IQs, contrary
to the first prediction, raises a number of questions
concerning generalization from previous
findings. Lateralization of lesion could not ex- 
plain these results since the discrepancy was 
more pronounced in the left temporal focus 
group as opposed to the right sided focus pa- 
tients. Equating the Control group to the epi- 
leptic groups on degree and pattern of per- 
sonality disturbance, while admittedly only 
an approximation, could have contributed to 
between group differences but not to the ob- 
tained within group differences in the epi- 
leptics. In seeking to illuminate the possible 
cause for the discrepancy between our results 
and others, three variables appear important. 
First, the cortical functioning of our popu- 
lation of temporal lobe epileptics was prob- 
ably less severely disturbed than other groups. 
In contrast to the present group, the Meyer 
and Jones (1957) epileptics were candidates 
for temporal surgery. Quadfasel and 
Pruyser’s (1955) patients had a longer dura- 
tion of illness and earlier age of onset. While 
this might explain no difference between
groups, it could hardly account for the sig- 
nificantly lower Performance Scale IQs. 
Second, our population has a lower Full 
Scale IQ than the previous studies. To esti- 
mate the possible effect of this variable, the 
published data of Meyer and Jones (1957) 
were analyzed (preoperative scores). By tak- 
ing the Full Scale IQ as the best overall measure
of the general level of intelligence, the
left-sided lesion group of patients were di- 
vided at the median into High and Low IQ 
groups. The High IQ group (N = 10) averaged
9.8 points lower on the Verbal Scale
relative to Performance IQs while the Low
IQ group (N = 10) had a .2 difference. This
difference between groups is significant at the 
.05 level by U test. Since the patients used by 
Quadfasel and Pruyser (1955) had an even 
higher average IQ than Ss used by Meyer 
and Jones and the present study, it may be 
the verbal deficit reported in preoperated 
temporal lobe epileptics holds only in high IQ 
groups. The complexity of this issue is em- 
phasized by the recent report of the preop- 
erative Wechsler-Bellevue Scores for left and 
right sided temporal lobe lesion groups by 
Milner (1958). In this study, using compara- 
tively high IQ groups, the preoperative Verbal 
and Performance IQs did not differ for the 
left-sided group, but the right-sided group had 
significantly lower Performance IQs! 
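The median split and U test described above can be sketched as follows. The records are hypothetical (Full Scale IQ paired with a Verbal minus Performance difference) and do not reproduce the Meyer and Jones data; the helper names are likewise illustrative. The function returns only the U statistic, which would then be referred to published tables for significance.

```python
def median_split(records):
    """Divide (full_scale_iq, difference) records into low and high halves.

    With an even number of records the halves are of equal size.
    """
    ordered = sorted(records, key=lambda r: r[0])
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


def mann_whitney_u(a, b) -> float:
    """Smaller of the two Mann-Whitney U statistics; ties count one half."""
    u_a = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a for y in b
    )
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical records: (Full Scale IQ, Verbal IQ minus Performance IQ).
records = [(100, -5), (90, 0), (110, -12), (85, 1)]
low, high = median_split(records)
u = mann_whitney_u([d for _, d in high], [d for _, d in low])
print(u)
```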

The third variable is the difference in test 
instrument. The previous studies used the 
Wechsler-Bellevue, while the present study 
employed the WAIS. Studies cited by Wech- 
sler (1958, pp. 114-117) indicate the possi- 
bility of Verbal-Performance IQ discrepancies 
with the WAIS in comparison to the Wechsler- 
Bellevue. The one study reporting both tests
on the same 50 Ss, drawn from a university
counseling center, indicates that WAIS Per-
formance IQs averaged 4.52 points lower than
the Verbal Scale IQs. However, no difference
existed between the WB Verbal and Perform-
ance IQs. A second study found Performance
IQs significantly lower (5.69 IQ points) than
Verbal IQs in 15 + psychiatric patients, but
again found no difference between the WB
Verbal and Performance IQs in 392 compa-
rable patients. Similar Verbal-Performance IQ
discrepancies are reported for a sample of 130
normal, aged Ss on the WAIS (Eisdorfer,
Busse, & Cohen, 1959).

The average Performance minus Verbal IQ
difference in the epileptic groups in this pres-
ent study is -4.65, almost the same as that
noted above. To check the possibility that
our epileptic Ss may not differ significantly
from the general population of the hospital,
the WAIS records of 100 patients under 60


TABLE 7
MEAN AGE, EDUCATION, PERFORMANCE, AND VERBAL
IQs OF 100 MALE HOSPITALIZED SUBJECTS WITH-
OUT DIAGNOSIS OF ORGANIC BRAIN DAMAGE

Performance IQ    Verbal IQ    Performance-Verbal

86.69             90.19        3.50*

* Significant at the .01 level.


years of age who had no diagnosis of brain 
damage, and who had not been used in the
present study were selected at random from the
Durham VA Hospital files. As indicated in 
Table 7, this group is comparable in age and 
education to the groups in the present study. 
The results are clear: the Performance IQ is 
3.50 points below the Verbal IQ, a significant 
difference (p < .01). The Performance-Verbal 
discrepancies of the two groups of epileptics 
do not differ significantly from this larger 
group. 

From these results it is concluded that our 
epileptic groups do not manifest the previ- 
ously reported preoperative Performance-Ver- 
bal IQ discrepancies. Further, should other 
investigators using different populations sub- 
stantiate the lower Performance IQ relative 
to Verbal IQ reported in the four studies 
cited above, the implication for the clinician 
is obvious: some readjustment in the concept 
of the magnitude of the discrepancy neces- 
sary before differences in those abilities rep- 
resented by these scales can be inferred. 

Confirmation of Prediction 2 in the Gener-
alized group vs. Controls suggests that the
pattern of subtest performance may reflect the
effects of cortical dysfunction of an epileptic 
nature in patients who may not be manifest- 
ing behavioral signs of epilepsy at the time of 
intelligence testing. It is of interest to note 
that the two subtests which discriminated 
best, Similarities and Picture Arrangement, 
have been identified by previous investiga- 
tors as susceptible to temporal lobe malfunc- 
tion (Milner, 1954; Reitan, 1957). 

Further, the effect of degree of abnormal 
activity is noteworthy (Tables 5 and 6). The 
Moderate-Marked Activity group had a much 
lower Performance IQ than Verbal while the 
Minimal Activity group had practically identical
Verbal and Performance IQs. The pres-
ence of subclinical neurophysiological dis- 
turbances may have a more telling effect on 
the timed tests constituting the Performance 
Scale. Indeed, all the Performance subtests
were lower for the Moderate-Marked group 
than the Minimal Activity group despite the 
former’s educational advantage. Such results 
are in accord with the impaired attentiveness 
and deficits in sustained serial activity re- 
ported in similar groups by Hovey and Kooi 
(1955); Morrell (1956); and Rosvold, Mir- 
sky, Sarason, Bransome, and Beck (1956). 


SUMMARY 


Patterns of cognitive functioning on the 
WAIS were studied in three groups of Ss:
15 psychomotor (temporal lobe) epileptics
with unequivocal unilateral EEG foci; 16
generalized seizure patients with temporal
lobe foci; and 15 non-CNS-damaged Controls.
Ss were matched on relevant variables
and were similar in degree and pattern of 
personality disturbance (MMPI). Contrary 
to previous reports, the Psychomotor group 
does not manifest a deficit in Verbal IQ rela- 
tive to Performance IQ. The Generalized 
group is significantly lower, as predicted, on 
a combined Information, Similarities, Digit 
Span, and Picture Arrangement Score. Sig- 
nificantly lower Performance IQs are found 
in a moderate and very active abnormal EEG 
activity group compared to a minimal ac- 
tivity group, regardless of seizure pattern. 

From data based on 160 hospitalized Ss in 
this study and studies cited by Wechsler, it 
is suggested that, unlike the Wechsler-Belle- 
vue, the Performance IQ on the WAIS tends 
to be three to five points lower than the 
Verbal IQ. 


Oscar A. Parsons and David E. Kemp


ON THE BREAKDOWN OF THE SENSE OF REALITY:
A COMMENT


PHILIP A. GOLDBERG


Hozier (1959) has attempted to answer a
difficult question. Accepting the breakdown of 
reality as the central fact of schizophrenia, 
she hypothesizes that, “the loss of the sense 
of reality in the schizophrenic individual in- 
volves a breakdown in the bodily self as a 
consequence of insufficient cathexis” (p. 186). 
As a test of this hypothesis Hozier investi- 
gated the spatial perception of schizophrenics 
by means of a Figure Placement task, a Doll 
task, and a Draw-a-Person task. Her results, 
which she accepts as confirming her hypothe- 
sis, show that schizophrenics make signifi- 
cantly more errors on these tasks than nor- 
mal controls. 

Unfortunately for the clarity of her results 
there are methodological and theoretical diffi- 
culties which she has either overlooked or too 
cavalierly dismissed. There has been much 
work reported in the literature to indicate 
that the generally poor performance of schizo- 
phrenics can, in large part, be attributed to 
motivational variables. Cavanaugh (1958) 
found this to be the case with concept for- 
mation tasks, and Lang (1959) obtained 
similar results investigating reaction time. 
Lang (1959) believes that “Shakow’s concept
[1950] of an inability to maintain set appears
to fit the data” (p. 267). Hozier mentions
Shakow’s concept but dismisses it on
the grounds that her experimental group con-
sisted of patients judged to be cooperative by
their ward psychiatrist.

One has no way of knowing or evaluating
the adequacy of the ward psychiatrists’ rat-
ings of cooperativeness. Hozier does not tell
us the procedures used to obtain these rat-
ings, nor does she specify the variables that
entered into the determination. One might
speculate that the demands of custodial care
might lead ward psychiatrists to equate trac-
tableness with cooperativeness. Without pur-
suing this speculation, it would seem fair to
say that when discussing schizophrenic per- 
formance, problems of motivation and set are 
too important for the reader to be confident 
that Hozier’s solution is adequate. 

The literature demonstrating schizophrenic 
deficit is too voluminous to need citation here. 
Hozier did not merely attempt to indicate an- 
other area of deficit; her attempt was to 
demonstrate the relationship between reality 
loss and “the spatial problem of dealing with 
the body ... and the relationship of the 
body to the world” (p. 194). But how would 
her schizophrenics perform in other task situa- 
tions that do not specifically involve such 
spatial perception of the body? It is pos- 
sible, perhaps likely, that they would show 
deficit. But whatever the case, Hozier’s study 
does not have this control and the reader 
must question the correctness of her conclu- 
sion, which relates the schizophrenics’ im- 
paired performance to the specific problem 
of the spatial perception of the bodily self. 
Hozier has arbitrarily explored a single facet 
of a multifaceted problem to emphasize bodily 
self as the primary model of reality. A more 
parsimonious explanation of her data might 
be formulated in terms of attitudinal and at- 
tentional variables presumed to operate in a 
wide range of task situations. 

Hozier does raise tangentially this ques- 
tion of confounding when she states, “what 
part other variables may play in accounting 
for the significant results cannot be deter- 
mined by the present study” (p. 193). But 
here it seems her primary concern is with the 
adequacy of her task situations. She dis- 
misses the question even before she raises it 
when she says, “certainly, the tasks are re- 
lated to spatial relations and organization” 
(p. 193). Accepting on faith the adequacy 
of her tasks, Hozier’s recognition that other 
variables might play a role in her results 
emphasizes the point made above. It is not
necessary to question that some relationship
exists between her tasks and spatial organi-
zation; it is necessary to question the con-
trol of the “other variables” that may also 
be related to the tasks and which might have 
made Hozier’s results significant. “Other vari- 
ables” always exist potentially, and no one 
has the right to demand of an experimenter 
that he exhaustively anticipate these vari- 
ables. However, when the accumulated re- 
search of other workers implicates particular 
classes of variables as being significantly re- 
lated to a phenomenon, it is the responsibil- 
ity of the experimenter investigating this phe- 
nomenon to control and account for these 
variables. 

There are other questions that might be 
raised about this study. Hozier states that pa- 
tients who were receiving insulin coma or elec- 
tric shock treatment were excluded from her 
experimental group. Presumably this was done 
to insure that her subjects’ (Ss’) performance 
would not be hampered by extraneous fac- 
tors. But it is widely recognized that one of 
the major obstacles in doing research with 
hospital patients is the difficulty in getting 
patients who are not “on drugs.” Hozier does 
not report whether the schizophrenics in her 
experimental group were receiving drugs. If 
they were, we can only speculate as to the 
probable effect on performance. 

Another question might be raised about 
Hozier’s use of one-tailed tests. Goldfried
(1959) would question her criterion for the 
selection of a one-tailed test, carrying with 
it the possible increase in the level of signifi- 
cance. Nevertheless, this issue is controversial 
and though this writer would agree with Gold- 
fried, it is perhaps not a crucial criticism. 
However, other criticisms already made do 
seem to be crucial. 

But it would be unfair and scientifically un- 
profitable to ignore the value inherent in 
Hozier’s work because of what seem to be 
certain significant weaknesses. Those features 
believed to be methodologically weak are not 
intrinsic to the theoretical problem at issue. 
Had the Ss been asked to draw three dimen- 
sional geometric figures in addition to draw- 
ing a person, the study might have been 
tightened considerably. Such a task would 
presumably not involve “the spatial problem 
of dealing with the body,” but it would appear
to tap essentially the same motoric and
spatial skills. Thus, it might provide one 
needed control of set variables. Certainly this 
is just one possible way, and undoubtedly not 
the best, to make Hozier’s study a more 
rigorous investigation.

Hozier has proceeded with an interesting
theoretical approach to attack imaginatively
a difficult problem. This is not meant to be
the traditional sop with which critics con-
clude their papers. Hozier’s study is worth
criticizing because of its fundamental value.
Should a more tightly designed study along
essentially the same lines as Hozier’s work
obtain similar findings, its implication for 
ego psychology and for an understanding of 
schizophrenic processes would be of undeni- 
able worth. Additionally, such a study might 
help to resolve the controversy over the na- 
ture of schizophrenic deficit. It has been sug- 
gested that schizophrenic deficit represents a 
withdrawal response to threatening stimuli 
(Rodnick & Garmezy, 1957). Cavanaugh 
(1958), Lang (1959), Hunt and Cofer (1944), 
and others have stressed the motivational 
variables involved in schizophrenic deficit. In 
short, we have yet to understand why the 
schizophrenic behaves as he does. Hozier’s 
work expands a new approach to an old 
question. We need new approaches.


A REPLY TO GOLDBERG


ANN HOZIER

Veterans Administration


Goldberg, after a thoughtful and careful
review of my article, raises some cogent criti-
cisms. His main and perhaps his most crucial
criticism is of my failure to control for moti-
vational variables. In support of this criticism,
he cites experimental work which concludes
that the poor performance of the schizo-
phrenic can be largely accounted for by
motivational variables. Attractive as this ex-
planation may be, I see it as being tautologi-
cal, in that by definition we characterize
schizophrenics as being motivationally im-
paired. They are apathetic, withdrawn, dis-
interested, distractible, lacking in goal direc-
tion, etc. If we attempt to account for their
behavior in terms of motivational variables
we are only attempting to affirm our defini-
tion. Such an interpretation is qualitatively
similar to saying that a mental defective’s
lack of intelligence hampers his performance.
Perhaps the kind of question we should ask
is what underlies the impairment of the abil-
ity to maintain set and attend? Do motiva-
tional variables constitute the necessary and
sufficient conditions for developing a stable
sense of reality? Our hypothesis is that the
ability to attend, to direct one’s energies to
a task, to make adequate use of stimuli, to
think abstractly, etc., is dependent upon an
adequate differentiation of body/world or, in
our terms, on a stable, well-cathected bodily
self. As long as this differentiation is not
clearly delineated, then true subject/object
relations cannot exist on a stable consistent
level. Motivated behavior, in its true meaning,
is predicated on the capacity of the subject
(S) to relate, to direct himself to an object.
We assume that the schizophrenic is incapable
of sustained motivated behavior. That is, he
does not experience himself as an S in rela-
tionship to an object with any consistency as
a consequence of insufficient cathexis of his
bodily boundaries. He does have affective
cathexis available to him, but it is attached
to various and sundry, isolated things.

I assumed that the schizophrenics would
have no vested interest in the tasks, that
they would not be motivated because of the
very nature of their illness. The word co-
operative, rather than motivated, was used
to indicate that the interest of the schizo-
phrenic was solicited rather than spontane-
ously given. I further assumed that coopera-
tion was not a global trait. We all have
observed that the schizophrenic, like most
people, will be cooperative with one person
and not with another or will do one task and
not another at request. The psychiatrists were
asked to select patients who met the selection
criteria (Hozier, 1959, p. 186) and whom
they judged would be cooperative with the
investigator in doing the particular tasks. The
psychiatrists were aware of the nature of the
tasks, but were not aware of the predictions.
They selected 36 patients, some of whom re-
quired maximum supervision because of their
destructive and assaultive behavior. Other pa-
tients resided on an open or semiopen ward.
Some of these patients were either already
working in the community or in the process
of obtaining positions outside of the hospital.
The records of the 36 patients were reviewed,
and it was found that 3 of them did not meet
one of the selection criteria. The patients were
seen in the order of their appearance on the
list. Of the 26 Ss contacted the cooperation
of 1 S could not be obtained.

Are clinical evaluations of cooperativeness
adequate? Did the investigator fail to see
evidences of uncooperativeness due to her
own involvement in the research? Unfortu-
nately, an unequivocal answer to these ques-
tions cannot be given. It is the investigator’s
impression that the schizophrenics’ failures or
inadequacies in dealing with the tasks were
not largely a consequence of their unwilling-
ness to attempt to deal with them.

Goldberg raises an important question when 
he asks how would schizophrenics perform on 
tasks not specifically involved in the spatial 
perception of the body. Theoretically, as long 
as the body is not sufficiently cathected to 
become bounded and differentiated from 
everything that is not the body, there exists 
no frame of reference from which to judge 
the reality of events in the external world or 
the reality of one’s own psychological experiences.
The conventional dimensions of space
are body dimensions. What is called up or 
down, to the left or right, or to the front 
or behind does not depend on the position of 
the object, but on the position of the body 
in relationship to the object. Thus, we would 
hypothesize that if the boundaries of the body 
are not sufficiently cathected, then there will
be a disturbance in spatial perception of non- 
human as well as human objects. We have 
seen this, for example, in the schizophrenic’s 
perception of the vertical (Carini, 1955), in 
his handling of the Bender-Gestalt (Bender, 
1938), in his dealing with the Body Space 
Test (Zierer, 1950), etc. 

I would expect if he were given, for ex- 
ample, a formal test of spatial perception or 
geometry problems in which he can depend 
on his knowledge of rules, propositions, and 
theorems of handling space that he would 
have significantly less difficulty than if the 
task required that he rely on his experiencing 
his relationship to and in space. Perhaps it 
will make it clearer to illustrate the distinc- 
tion between reality testing and a sense of 
reality. Depersonalization, estrangement, and 
hypochondriasis, often early signs of schizo-
phrenia, exemplify this difference. The person 
may experience himself as changed, experi- 
ence his world as changed, or experience him- 
self as physically sick, yet at the same time, 
know that he is the same person, that the 
world hasn’t changed, and that his physical 
health is unimpaired. Both reality testing and 
a sense of reality are necessary for dealing 
with the world, but in general, reality testing 
refers to testing something via the intellectual 
processes, and a sense of reality, of experi- 
encing something affectively. The more the
task can be adequately dealt with by a pure
knowledgeable approach, the less, it is hy- 
pothesized, will the schizophrenic fail. Had I 
used the type of control mentioned above, it 
undoubtedly would have tightened the re- 
search considerably. 

Goldberg is correct that I was concerned 
with the adequacy of the tasks. When he says, 
“It is not necessary to question that some 
relationship exists between her tasks and spa- 
tial organization; it is necessary to question 
the control of the ‘other variables’ that may 
also be related to the tasks and which might 
have made Hozier’s results significant,’ I be- 
lieve his concern here is again with the moti- 
vational variables. I have already attempted 
to indicate why I am not concerned with
these variables in the same way as he. 

Recent ECT and Insulin Coma Therapy 
were controlled not because they were ex- 
traneous factors per se, but because of the 
confusion which so often accompanies their 
use. Confusion has not been noted as a con- 
sequent of tranquilizers unless used in toxic 
dosages. In those cases, the patients would 
have had neurological signs and on that basis 
have been excluded. The confusion the pa- 
tients had was part and parcel of their illness. 
Some of the patients were on drugs and some 
not; however, the differences in their perform- 
ances were not analyzed. I would predict that 
of the patients having an equal degree of 
disturbance those on tranquilizers would per- 
form more adequately than those not on 
them. 

As Goldberg indicates, there has been con- 
siderable controversy about the use of the 
one-tailed test. As to the criteria for its use, 
I continue to agree with Jones (1954, p. 586).

Goldberg has raised some significant and 
crucial questions about my research. Although 
I disagree with him on his major point of the 
control of motivational variables, I would 
agree that some question may be raised about 
the adequacy of clinical judgments of co- 
operativeness. However, as I have thought of 
this criticism, I admit that I find it 
difficult to conceive of an adequate way to 
control this variable which at some level or 
other does not lean heavily on clinical judg- 
ment. His point about the use of a control 
task is well made. I would expect the use of
drugs to exert its effect in the direction of
nullifying the hypotheses. Thus, I doubt that 
the lack of control of drugs contributes to 
the significant results. 

There is considerable need for further re- 
search not only to cross-validate my findings 
with some of the controls mentioned above, 
but to investigate other hypotheses derived 
from the theory. It is impossible for me to 
state the importance of the bodily self in a 
clearer way than Scott (1951), when he says, 

The body image is not just an object of study in
the sense that [...] made so many things
objects of study. It is essentially something that one
lives through and in. To-day when there is so much
disembodied knowledge [...] many psy-
chological forces loose in the [...] belonging to
no one, and when we are in [...] danger of
being devoured by our [...] creations, it seems sig-
nificant that the human [...] of these should be
brought [...] is in the last [...] (p. 266).


The rubric of sex-role differentiation has
been used to cover both the processes leading 
to sex differences in overt behavior (e.g., boys 
are more physically aggressive than girls) 
and the introjection of values and attitudes 
that are more appropriate in a given culture 
for one sex than the other. A recent attempt 
at clarifying the concept of sex-role differen- 
tiation has been made by Brown (1956, 1957) 
with the assistance of Robert Sears, and Lynn 
(1959). These authors have distinguished 


among sex-role identification, sex-role prefer- 
ence, and sex-role adoption. The development 
of masculinity and femininity is thus defined 
in terms of at least three logically separated 
psychological processes. The present study 


was concerned only with sex-role preference 
which has been defined by Brown (1956) as 
“behavior associated with one sex or the other 
that the individual would like to adopt or that 
he perceives as the preferred or more desirable 
behavior.” 

The sex-role preferences of children have 
been measured in two ways. Rabban (1950) 
developed a technique which consisted of ask- 
ing children to state their preferences from a 
group of 16 toys, 8 commonly associated with 
girls and 8 associated with boys. The admin- 
istration of this test to 300 children from 
three through eight years of age showed that: 
(a) boys possess clear-cut preference patterns 
at an earlier age than girls; and (0b) strong 
appropriate preference patterns appear earlier 
among lower-class children than among mid- 
dle-class children. 

Brown (1956) measured children’s sex-role 
preferences by means of a projective test 
called the It Scale for Children. In taking this
test the child chooses between pictures of 
various objects commonly associated with one
sex or the other (e.g., toys, clothes, household
objects, games, etc.). The choices are not
made for the child himself but for “It,” a
drawing of a sexless figure. Brown has re-
ported normative data for 146 kindergarten 
children from (1956) and 
613 children from kindergarten through fifth 
grade from Pleasanton, California (1957). 
These data have shown that: (a) distinctive 
sex-role preferences existed for boys and for 
girls at all ages studied; (b) kindergarten 
boys were masculine in their preferences but 
older boys were even more masculine in their 
preference scores; (c) kindergarten girls had 
“mixed” preferences and older girls slightly 
masculine preferences; (d) at all age levels 
girls’ preference scores were more variable 
than boys’ scores.

The present study was designed to extend
Brown’s work with the It Scale for Children. 
Preferences of the kind measured by the It 
Scale are undoubtedly acquired before the 
child is of kindergarten age. One purpose of 
this investigation was to obtain information 
concerning the sex-role preferences of three- 
and four-year-old children. 

A second purpose of the present study was
to explore the effects of the instructions em-
ployed with the It Scale. It has been men-
tioned previously that the subject’s (S’s)
choices are made for an ambiguous drawing
of a child rather than for S himself. The simi-
larity of the pictured figure to S has been
shown to be a relevant variable to children’s
performance on some projective tests (Arm-
strong, 1954; Bills, 1950; Furuya, 1957) but
irrelevant to performance on others (Biers-
dorf & Marcuse, 1953). If “stimulus-to-sub-
ject similarity” should affect performance on
the It Scale, it is obvious that any one pro-
cedure would not reveal all the significant
facts concerning the sex-role preferences of
children.

Three different sets of verbal instructions
were employed with the It Scale in the pres-
ent study. These instructions were designed
to produce variations in the test situation
along a dimension from low to high stimulus-
to-subject similarity.


METHOD

Subjects

The Ss for this investigation were 161 children
between the ages of [...]. Ss were obtained [...]
two [...] one university laboratory [...]. The sam-
ple included all children [...] schools except six
children who were absent during testing, and one
child who refused to leave the school group with
the experimenter. The sample included 78 girls
and 83 boys. Sixty-five of the Ss constituted [...]
instruction groups (Groups A, B, and C) [...] of
Ss was random except that the groups were
matched as closely as possible for [...] and age.
Subsequent to the [...] of the [...] 161 Ss were
[...] cell entries in [...] analysis of variance [...]
of these Ss were [...] boys [...] were selected
randomly from [...] containing an excessive
number of [...]

Procedure

The It [...] individually [...] room at [...] by
giving [...]

Throughout the administration of the test E referred
to the drawing as “It.”

Group B (moderate similarity to S):

We are going to play a game with this little boy
(girl if S were a girl). This little [...] about
the [...] you hold him. Now we are
going to show this little boy some cards with pic-
tures on them.

E continued to use the appropriate noun or pro-
noun throughout the test.

Group C (high similarity to S):

We are going to play a game with this [...]
whose name is (S’s name). His [...] game [...]
about [...]

[...] continued to use [...]

The various [...] pictured toys [...] first section
of the test [...] four rather than in [...] pictures
were grouped [...] the young Ss would [...]
grouping included two [...] arranged in alternating
[...] group of 16 pictures [...] with kindergartners
[...] just described and [...] attached to 9” by [...]

Scoring

The weighted [...] Brown (195[...]) [...] choices
[...] varying [...] choices were [...] system the
possible [...] zero (consistently [...]) [...] consist-
ently masculine [...]

Reliability

Brown has reported [...] and .84 for the It Scale
[...] Ss [...] days apart [...] the scale for Ss of
three and four [...] obtained in the present study.
Forty Ss were [...] of the test. These [...] instruction
[...] come” basis [...] one of the [...] interval
between tests [...] moment correlation between
tests was .66 for boys (p < .01) and [...] for girls.
These coefficients are fairly high for tests involving
preschool Ss with an interval between tests of
two months.

1 The authors are grateful to the staffs of the
following schools for their cooperation: Day Care
Service, Inc., Des Moines, Iowa; Plymouth Congre-
gational Church Nursery School, Des Moines, Iowa;
Parents’ Cooperative Preschool, Iowa City, Iowa;
Preschool Laboratories of the Iowa Child Welfare
Research Station, State University of Iowa, Iowa
City, Iowa.
2 These instructions are identical to those de-
veloped by Brown (195[...]).

When mean scores from the two administrations
of the It Scale were compared it was found that 
girls moved toward greater femininity on the re- 
test (see Table 1). It is difficult to account for this 
change in terms of inter-S communication, practice 
effects, etc., considering the ages of the Ss and the 
length of the interval between tests. It is possible 
that the test-retest differences reflect new attitudes 
acquired by girls during the two-month interval be- 
tween tests. Interestingly, boys did not move to- 
ward greater masculinity on the retest. This would 
imply that the preschool period may not be so im-
portant for the male as for the female as a transi-
tional period in sex-typing.


RESULTS 

The It Scale scores of 150 Ss were ana- 
lyzed by means of an analysis of variance that 
involved three factors: age, sex, and instruc- 
tional procedure. The frequency distributions 
of the 12 subgroups appeared by inspection 
to approximate normality. The variances of 
these subgroups appeared to be homogeneous 
with two exceptions: (a) the variance for 
four-year-old girls in Group A was approxi- 
mately four times the average variance of the 
other subgroups; (4) the variance for four- 
year-old boys in Group C was approximately 
one-third the average variance of the other 
subgroups (see Table 2). While two vari- 
ances thus appeared to be deviant, the possi- 
bility remained that homogeneity might be 
true of the table as a whole due to the ap- 
parent homogeneity of variance in 10 of the 
cells. Accordingly, Bartlett’s test was applied 
which yielded a chi square of 17.86, df = 11, 
p < .10. Thus the hypothesis of homogeneity 
of variance could be sustained. It was de- 
cided, however, that no findings from the 


TABLE 1

MEAN It SCALE SCORES ON TEST AND RETEST

Mean SD
Girls (N = 20)
Test
Retest
Boys (N = 20)
Test 60.30
Retest 63.33


TABLE 2

MEANS, RANGES, AND STANDARD DEVIATIONS OF
SCORES ON THE It SCALE FOR BOYS AND GIRLS IN
TWO AGE GROUPS AND THREE INSTRUCTION GROUPS



analysis of variance would be accepted as sig- 
nificant unless the probability level reached 
was beyond .01. This decision was based on 
recommendations by Lindquist (1953). 
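
The probability attached to the Bartlett statistic can be checked from the chi-square distribution. The sketch below is an illustration rather than part of the original analysis: `chi2_sf` is a hypothetical helper written for this purpose, and it relies on the standard recurrence for the upper incomplete gamma function using only the Python standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Values reported in the text: chi square = 17.86 with df = 11.
p_bartlett = chi2_sf(17.86, 11)
print(f"p = {p_bartlett:.3f}")
```

Under these assumptions the probability falls between .05 and .10, which agrees with the reported p < .10 and with the decision to retain the hypothesis of homogeneity while adopting the stricter .01 criterion.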

The results of the analysis of variance may 
be found in Table 3. In this analysis the ef- 
fects of sex were highly significant (F =
220.762, p < .001). While on the one hand
this finding merely demonstrates the validity 
of the It Scale, it is also important because 
it demonstrates clear-cut sex-role differentia- 
tion among very young children. 

The age by sex interaction was also signifi-
cant (F = 16.464, p < .001) and may be in-
terpreted as meaning that the effects of age
on It Scale scores depended upon the sex
of the child. The simple effects of age were
studied by means of t tests in which the error
term from the analysis of variance was em-
ployed. The age difference for girls was sig-
nificant (t = 3.925, p < .001), indicating that
four-year-old girls were more feminine in their
sex-role preferences than three-year-old girls.
The age difference for boys, on the other
hand, reached only borderline significance (t =
1.840, p < .10). Table 2 shows that the di-
rection of this trend was for four-year-old
boys to display more masculine preferences
than three-year-old boys.

The effects of the instructions on It Scale scores also depended on the sex of the child. The instruction by sex interaction was significant beyond the .005 level. Analysis of the simple effects showed: (a) girls in Group A showed less feminine preferences than girls in both Group B (“little girl”; t = 3.320, p < .01) and Group C (S’s name; t = 2.637, p < .01); (b) mean scores for girls in Groups B and C did not differ; (c) boys in Group A did not differ from boys in Group B, but were less masculine than boys in Group C (t = 2.019, p < .05); (d) boys in Group B were slightly less masculine than those in Group C (t = 1.715, p < .10).


TABLE 3

ANALYSIS OF VARIANCE OF It SCALE SCORES

Source                       df    Sum of Squares    Mean Square    F
Instructions                  2        1,193.45          596.73
Age                           1          506.25          506.25
Sex                           1       51,931.20       51,931.20
Instructions × Age            2          151.64           75.82
Instructions × Sex            2        2,813.46        1,406.73     5.980*
Age × Sex                     1        3,872.99        3,872.99    16.464**
Instructions × Age × Sex      2          457.21          228.61
Within cells                138       32,462.63         235.236
Total                       149       93,388.83


Inspection of the raw data relative to the analysis of variance showed confounding of both birth order and size of family with the variable of age. A greater proportion of three-year-old Ss as compared with four-year-olds were first born children (p < .10). Also, four-year-old boys had more siblings than three-year-olds (p < .02). The same tended to be true of girls, although only at the .20 level. The results of a series of t tests, applied to the age differences with both birth order and family size held constant, were consistent with the analysis of variance results. Four-year-old girls were more feminine than three-year-olds (p < .01); four-year-old boys were more masculine than three-year-olds, but only at a borderline level (p < .10).

The analysis just reported yielded one other finding of some importance. There were no significant differences between It Scale scores of first born and later born children. Thus birth order, per se, appeared to have no relation to the early acquisition of sex-role preferences.

Although this study was not designed to


relate sex-role preferences to socioeconomic 
status, Ss were obtained from two diverse so- 
cial classes. However, 61% of the lower-class 
children (36 Ss from the community-sup- 
ported day care center) had fathers not liv- 
ing at home. When compared with the other 
Ss tested, who were from intact middle-class 
homes, no class differences were found in sex- 
role preference scores for either boys or girls. 
For heuristic purposes, the 14 lower-class Ss 
who came from intact homes were compared 
with the middle-class Ss. Once again, no class 
differences in It Scale scores emerged. 


DISCUSSION 
Sex Differences 

The sex difference found in It Scale scores 
suggests that at least some aspects of sex-role 
differentiation begin very early in life. Be- 
ginning in infancy, parents in the United 
States commonly prescribe different toys, 
modes of dress, etc., for boys and for girls. 
The findings of the present study reflect the 
success of such child training procedures. 

A sex difference in degree of appropriate 
sex-role preferences was also revealed by the 
data. In middle childhood, boys prefer the 
stereotyped masculine role more strongly than 
girls prefer the stereotyped feminine role
(Brown, 1957). The same appeared to be 
true of the three-year-olds in the present 
study (see Table 2). Among four-year-olds in 
Groups A and C boys also showed slightly 
stronger masculine preferences than girls 
showed feminine ones. Only in Group B were 
four-year-old girls more feminine than boys 
were masculine. Thus, the majority of these
data are in line with the findings for older 
children: boys more strongly prefer the stereo- 
typed male role than girls prefer the stereo- 
typed female role. It should be stressed, how- 
ever, that this tendency was considerably less 
marked among the preschool Ss of the pres- 
ent study than among the older Ss of other 
studies. Quite probably the preschooler is only 
beginning to experience: (a) the greater socio- 
cultural advantages and more consistent re- 
wards for sex-typed behavior accorded the 
male in United States culture; (b) the more
clearly defined sex-role for the male than for 
the female; and (c) more frequent punish- 
ment of boys than girls for opposite-sex be- 
havior. 


Age Differences 


Although only significant at a borderline 
level, four-year-old boys obtained more mas- 
culine scores on the It Scale than three-year- 
old boys. A similar relation between age and 
sex-role preferences has been reported for 
older boys (Brown, 1957). In the present 
study three-year-old boys obtained a mean 
score of 58.47; the 11-year-old boys tested by 
Brown obtained a mean score of 76.73. Thus, 
for boys, there appears to be a steady change 
toward greater masculinity during the 3- 
through 11-year period. This change suggests 
that boys probably receive relatively consist- 
ent reinforcement throughout early and mid- 
dle childhood for adopting certain stereo- 
typed aspects of the male role. This finding 
could also reflect a condition wherein the 
early parental demands for appropriate male 
sex-typing are augmented in later years by 
the demands of other socializing agents (e.g., 
teachers, peers, etc.). 

The present findings showed that four-year- 
old girls made significantly more feminine 
choices than three-year-old girls. By itself, 
this finding is analogous to the age difference 
found for boys. However, Brown (1957) re- 
ported a change toward masculinity in older 
girls: kindergartners obtained a mean score 
of 38.40 while fourth-grade girls reached 
56.40. In the present study the three- and 
four-year-old girls in Group A (whose scores 
are most comparable to Brown’s) obtained mean scores of 37.70 and 31.27, respectively. Therefore, preschool-aged girls possess the most feminine sex-role preferences of any group of girls in the 3- through 10-year period. Actually, the age differences found in It Scale scores for girls support an hypothesis of Lynn’s (1959): girls may be femininely oriented in early childhood due to a basic identification with their mothers, but later exposure to the masculine-oriented culture breaks down or covers over this identification during the years of middle childhood. It could be also that the mother, who is virtually the sole feminine model for the preschool girl, is a model that is particularly stereotyped (i.e., the mother’s femaleness is typified for the child by homemaking, child care, etc.). On the other hand, the older girl has additional feminine models available to her (outside the home, on television, in fiction, etc.). These models are frequently observed in nonstereotyped situations or in situations where their behavior is more typical of men than women. Such a broadened conception of the female role for the older girl could produce more masculine scores on the It Scale in which femininity is defined in terms of highly stereotyped choices.

Effects of Instructions


Instructions that employed S’s name in association with the drawing resulted in more masculine scores for boys than instructions that stressed lesser degrees of stimulus-to-subject similarity. It is possible that high stimulus-to-subject similarity elicited the most valid measure of boys’ sex-role preferences since the questioning was related to S himself. On the other hand, the use of S’s name could have aroused anxiety concerning E’s approval of inappropriate (i.e., female) preferences. Such anxiety could have elicited an inordinate number of safe, stereotyped male preferences. Thus the data indicate only that the instructions employed with the It Scale affect the sex-role preference scores of boys; the data do not indicate which instructions produce the most valid scores.


The finding that no difference existed be- 
tween scores for boys in Group A (“It”) and 
Group B (“little boy”) can be interpreted in 
two ways. First, boys in Group A may have 
perceived the figure as a boy.* If such were
the case, the test situation for Group A would 
be very similar to the situation for Group B. 
The alternative interpretation is that Ss in 
Group B were indifferent to the label “boy.” 
This is saying in effect that Ss in neither 
group perceived the figure as a boy, a situa- 
tion which seems improbable. Since the find- 
ings demonstrate that stimulus-to-subject simi- 
larity affects boys’ responses on the It Scale, 
the various post hoc hypotheses concerning 
the dynamics of these effects become impor- 
tant problems for future research. 

The instructions employed with the It Scale also influenced the scores of girls. For all girls, but particularly for four-year-olds, instructions that referred to the figure as “a little girl” resulted in more feminine scores than the label “It.” The use of S’s name produced little additional femininity in girls’ scores. Since labeling the figure “a little girl” produced such marked femininity in girls’ scores, the question is raised as to whether some girls in Group A perceived “It” as a boy. The hypothesis that the sexless figure was perceived as a boy by some Ss is supported by the findings that four-year-old girls in Group A had much more variable scores than girls in Groups B and C. However, variability among Group A three-year-olds was not greater than among three-year-olds in Groups B and C. If some girls perceived “It” as a boy, why then did three-year-olds make so few masculine choices? Possibly the three-year-old girls did not possess adequate information concerning the components of masculine behavior. If so, Ss of this age who perceived “It” as a boy would have been unable to give consistently masculine choices. On the other hand, the four-year-old girls may have been sufficiently sophisticated concerning “boyishness” that they were able to choose appropriately for the figure if they saw “It” as a boy. This interpretation would explain the fact that the differences in both means and in variability were larger among the subgroups of four-year-old girls than among the subgroups of three-year-old girls.

* Subsequent to this investigation, the author asked […] Ss in another nursery school group what was pictured on the card. Six of the Ss said it was a picture […] a girl. Obviously, data for many more Ss are needed before evaluating the validity of the hypothesis that the figure is a “boyish” figure for preschool-aged […]. The results of the poll are mentioned only to suggest that this hypothesis may not be […]tenable.

These findings carry particularly strong im- 
plications for the use of the It Scale with 
girls. If the unstructured drawing is fre- 
quently seen as a male, then older girls’ re- 
sponses under this condition obviously do not 
indicate the extent to which they prefer the 
female role. 


Birth Order 


The data from this study indicated the ir- 
relevance of birth order to the acquisition of 
sex-role preferences. It should be pointed out 
that it is not known if older siblings of a par- 
ticular sex (e.g., only older brothers or older 
sisters) affect the development of such pref- 
erences. The present investigation did not 
offer a means of checking findings of this kind 
that have been reported by Koch (1956) and 
Brown (1956).


Social Class 


The fact that mean scores for the two socio- 
economic groups did not differ in the present 
study could possibly be due to the particular 
lower-class population studied. One would 
predict on the basis of Rabban’s (1950) find- 
ings that scores for lower-class children would 
be more sex-typed than scores for middle-class children. On the other hand, Pauline Sears (1951), in a study of doll play aggression, found that boys whose fathers were absent from the home were less aggressive (and as such, perhaps more feminine) than boys whose fathers were present. Since 61% of the lower-class Ss in this study had fathers who were absent, it is possible that the effects of social class and father absence canceled each other out in the comparison between lower- and middle-class Ss. No differences were found between the lower-class Ss whose fathers were present and the middle-class Ss. However, the probability of a gross sampling error in this comparison is extremely high in view of the fact that fewer than 10 children of each sex made up the lower-class sample.
SUMMARY


The purposes of this study were: (a) to 
acquire normative data relative to sex-role 
preferences in three- and four-year-old children,
and (b) to study the effects of verbal
instructions that stressed varying amounts of 
similarity between a drawing and S himself 
on a projective test of children’s sex-role pref- 
erences. 

Ss were 161 three- and four-year-old chil- 
dren drawn from four nursery school popula- 
tions in Iowa. Sixty-five Ss, ranging in age 
from 3-0 to 4-0, constituted the three-year- 
old group and 96 Ss, ranging from 4-0 to 
5-0, constituted the four-year-old group. 

Sex-role preferences in this study were 
measured by the It Scale for Children. The 
major findings: (a) clear-cut sex differences 
in It Scale scores were found; (b) girls at
four years scored significantly more feminine
than three-year-old girls; (c) four-year-old
boys were more masculine than three-year-
old boys at a borderline level of significance;
(d) girls responded with more feminine scores
when the drawing employed in the It Scale
was called “a little girl” than when called
“It”; (e) boys responded with more masculine
scores when the figure was called by S’s
own name than when the figure was called
“It.”

While these findings imply that early child- 
hood is an important period in sex-role de- 
velopment, they also imply that the acquisi- 
tion of sex-role preferences by the male is a 
less complicated developmental process than 


for the female. The findings also suggest that 
the measure of sex-role preferences provided 
by the It Scale is highly sensitive to varia- 
tion in the instructions given to the S. 
PERCEPTUAL-MOTOR DEVELOPMENT IN CHILDREN
RETARDED IN READING ABILITY¹


FRANK M. LACHMANN


Finding causes for reading disabilities in children has intrigued the members of almost every profession having contact with such children. Although varied explanations for reading disabilities have been offered, two main approaches to the problem can be identified. Reading disability has been considered by a number of writers (Jarvis, 1958; Vorhaus, 1946) primarily as a symptom of emotional disturbances. They discussed what reading might symbolize for the child (i.e., conforming to environmental demands, a “feminine” activity for boys, the unconscious conflicts, etc.) and what the child might be rebelling against when he fails to learn to read.

A different emphasis is expressed by Bender (1949) and Fabian (1945) who view the development of reading ability as dependent upon the same factors which result in perceptual-motor maturity. Following this viewpoint, Silver (1952) includes as specific and integral parts of this syndrome functioning on the Visual Motor Gestalt Test on a rather primitive level, e.g., difficulty in constructing angles and difficulty in figure background perception. In their studies Fabian and Silver found that retarded readers show inferior gestalt test performance compared with normal readers. These results seem to confirm the hypothesis that reading disability is characterized by perceptual-motor immaturity and hence a developmental lag. Yet these studies may be criticized for failing to assess the extent to which the emotional conflicts are primary and the developmental retardation (the reading disability) a reflection of the conflicts.

1 The present paper is a condensation of part of a doctoral dissertation […] Northwestern University, Evanston, Illinois. The author wishes to thank Robert I. Watson for his suggestions, and Jacob Cohen of New York University for his help with the statistical section of this paper and the staffs of the Institute for Juvenile Research, Chicago; Dyslexia Memorial Institute, Chicago; Bureau of Child Study, Chicago; National College of Education, Evanston, Illinois; Michael Reese Hospital, Chicago; and Bellevue Psychiatric Hospital, New York, for their cooperation in obtaining the subjects.

Perceptual-motor functioning, usually op- 
erationally defined by performance on the 
Bender gestalt test or other tests of this na- 
ture, has been described by Silver (1953) as 
not only involving visual perception, but also 
the expression of that perception, the result 
reflecting the quality of the perception plus 
the motor impulsivity and the attempts at its 
control. These factors are considered as in- 
separable. In the present study an attempt 
has been made to investigate this develop- 
mental framework and to control for the in- 
fluence of emotional maladjustment in the 
retarded readers. Thus, if reading disability 
reflects a lag or retardation in perceptual- 
motor development, retarded readers should 
show significantly less mature performance on 
the Bender gestalt test than either a compa- 
rable group of normal readers or a group of 
emotionally disturbed children who have nor- 
mal reading ability. 


SELECTION OF SUBJECTS 


Since reading disability is generally defined 
as an inability to learn to read properly in 
spite of normal intelligence, it was necessary 
to obtain a measure of reading ability and in- 
telligence for each subject (S) to be consid- 
ered for inclusion in the present study. To 
ascertain the reading level, the California 
Reading Test (Tiegs & Clark, 1950a, 1950b) 
was employed with reading retardation being 
defined as a score at or below the 20th per- 
centile for the S’s chronological age and nor- 
mal ability as a score at the 40th percentile 
or above. Normal intelligence was defined as 
an IQ score of 90 or more on the verbal section of the Wechsler Intelligence Scale for Children (WISC) (Wechsler, 1949), or the Stanford-Binet test (S-B) (Terman & Merrill, 1937) or the Kuhlman-Anderson Intelligence Test (K-A) (Kuhlman & Anderson, 1952).²

Three groups of 40 children were selected. Group R consisted of children meeting the above criteria with respect to reading retardation and normal intelligence as measured by the WISC. They were selected from clinics for remedial reading. Children referred to mental hygiene clinics for psychological or psychiatric diagnosis or treatment who were of normal intelligence and normal reading ability comprised Group E. The reasons for which these children had been referred to the clinics may be summarized in three overlapping categories: behavior problems in school (disturbance in class, truancy, etc.); unmanageable behavior at home (stealing, temper tantrums, etc.); “nervous symptoms” (fears, enuresis, etc.). The normal group (Group N) was obtained from a public school and did not include any children having a record of referral for diagnosis or treatment because of emotional difficulties.

Each of these groups was divided into subgroups of 20 Ss each according to age. The younger half (Subgroup Y) ranged in age from 8 to 9 years and 11 months, and the older half (Subgroup O) from 10 years to 11 years and 11 months. More detailed information concerning the groups is contained in Table 1.

For Groups E and R, if a WISC or S-B score less than one year old was available, and for Group N if a K-A score less than one year old was available, no additional IQ tests were given. Otherwise the investigator individually administered the verbal section of the WISC. The reading test was individually administered to the children in Groups R and E and in a group administration to the children in Group N.

Since reading disability is noted more frequently among boys, the groups were equated as to ratio of boys to girls, as well as age and IQ. The t tests computed between mean reading scores at comparable age levels for Groups E and N did not reach statistical significance. An analysis of variance computed from the IQ scores of all six groups did not yield a significant F value. This suggests that the groups were adequately equated on the basis of reading ability and intelligence.

2 Because of time limitations it was not possible to use the same intelligence test for all children. Correlation among the scores of the three tests (Pastovic & Guthrie, 1951; Traxler, 1941) for similar age groups and within the IQ range of the present study suggests sufficient comparability of scores to permit using them for equating the groups.


TABLE 1

DESCRIPTION OF GROUPS

Variables                     Diagnostic Group
                              (Reading Disability, Group R; Emotionally Disturbed, Group E)
Number of Subjects
Ratio, Boys: Girls            14:6
Age in Months
  Mean
  SD
Intelligence Level (IQ)
  Mean
  SD
Reading Level in Months
  Mean
  SD

a WISC Verbal IQ
Verbal IQ or S-B


METHOD 


The nine figures comprising the Bender gestalt test were individually administered according to the usual procedure as described by Bender (1938). Five types of distortions which, according to Bender (1938), are suggestive of immaturity in perceptual-motor development when still produced above the age of eight years, were then scored as being either present or absent for each child. Definitions of the distortions were taken from Bender […] and from the Bender gestalt test scoring criteria developed by Pascal and Suttell […].

Angulation: blunting corners, adding extra angles, omitting […] direction of […] gaps at corners greater […] eighth inch […]. Silver (1953) has related the presence of this distortion to reading disability. […] Figures A, 4, […] showed the distortion […] scored as present for the S.

Rotation: scored as present when in the reproduction of a figure the […] axis of the stimulus figure had been turned […] degrees. No score was given when the S turned […] to make the most economical use […]. Fabian (1945) and Silver […] have related […] to reading disability.

Primitivation: […] to reading disturbance […].

Separation: separating the adjoining or overlapping parts of figures […] eighth inch or more. […] A, 4, 7, or 8 […].

Slant: drawing the columns of circles in Figure […] perpendicular to the horizontal axis of the figure […] slanting them in the opposite direction. If two columns or more were distorted […] given.

A tally was made of the number of children for whom these distortions were scored present. Table 2 summarizes this data for each of the six groups.


ANALYSIS OF THE DATA 


A relatively new method of data analysis, variously called multivariate information transmission analysis (McGill, 1954) and uncertainty analysis (Garner & McGill, 1956), was employed in the analysis of the data. This method makes possible the analysis of nominally scaled dependent variables in factorial design formats and can be used as a nonmetric and nonparametric analogue of the analysis of variance (Attneave, 1959; Garner & McGill, 1956). For designs in which the effects of orthogonal independent variables on a dependent variable are to be studied, the significance of interactions as well as main effects can be determined. The functions studied are amounts of “information transmitted” (or degree of uncertainty reduction) measured in “bits,” for either main effects (T’) or interactions (A’). The product of a T’ or A’ value with the constant 1.3863N is distributed as chi square for the appropriate number of degrees of freedom (df). A bias in the resulting chi squares may be corrected by subtracting from it the df for that source (Attneave, 1959),³ as can be seen in Table 3. The amount of information transmitted from each source, its df, and the resultant chi square are additive. The over-all sum reflects the total amount of information transmitted between the independent variables taken together and the dependent variable or the amount of uncertainty reduction, or more generally still, the association. It is analogous to a multiple correlation with the dependent variable, when all variables are metric.

3 Attneave (1959) suggests the correction of the T’ or A’ value for bias. It is, however, readily demonstrable that the method used here is equivalent, and for the purpose of significance testing, computationally simpler.


TABLE 2


TABLE 3

MULTIVARIATE INFORMATION TRANSMISSION ANALYSIS OF THE EFFECTS OF AGE, DIAGNOSTIC GROUP, AND TYPE OF DISTORTION ON THE PRESENCE-ABSENCE OF DISTORTION

Source of Transmission                                Amount (Bits)    df    Corrected Chi Square
T’ (Age; P-A)                                            .00917
T’ (Diagnostic Group; P-A)                               .00887
T’ (Type Distortion; P-A)                                .11844
A’ (Age, Diagnostic Group, P-A)                          .00497
A’ (Age, Type Distortion, P-A)                           .00380
A’ (Diagnostic Group, Type Distortion, P-A)              .01126
A’ (Age, Diagnostic Group, Type Distortion, P-A)         .00600
T’ (Age, Diagnostic Group, Type Distortion; P-A)         .16251                    106.173


RESULTS


As applied to this problem, age (the three younger subgroups compared with the three older subgroups), diagnostic group (comparisons among Groups R, E, and N), and type of distortion are the independent variables and presence or absence of the distortion (P-A) is the dependent variable. Table 3 sets forth these results.

The highly significant total amount of information transmitted indicates that the joint effect of age, diagnostic group, and type of distortion is a real one. This effect may, however, be trivial since the effect of type of distortion is merely due to the fact that the five distortions investigated are of different incidence in the population. It is in fact this source that accounts for the lion’s share of the total amount of information transmitted. None of the interactions (A’) even approach significance. The data thus fail to indicate the existence of any patterns of type of distortion as a function of age, or of diagnostic group, or of age and diagnostic group.

The analysis of the data does indicate, however, that the age difference contributes significantly to the presence of the distortions, the younger groups providing the greater number of distortions. Since the two age groups studied differed by about two years, the significance of the difference between them attests to the ability of the Bender gestalt test to differentiate along a chronological age continuum. Thus the five distortions investigated seem to be affected by maturational factors between ages 8 and 12. This finding confirms Bender’s descriptive data (1938) that the test is sensitive to differences in perceptual-motor ability up to the age of 12 years.

The differences among Groups R, E, and N meet the .05 criterion of significance for the uncorrected chi square. This would suggest group differences in the incidence of the distortions. However, upon correction the significance of the difference slips to the .07 level. It was deemed worthy to follow this up and to attempt to identify the source of the difference more specifically. Accordingly, further complete multivariate transmission analyses were performed on the diagnostic groups taken a pair at a time. In these analyses, it was found that total distortion was significantly greater in the retarded readers than in the normal group (T’ = .01239, corrected chi square = 5.870, p < .02), and tended to be greater in the retarded readers than in the emotionally disturbed group (T’ = .00780, corrected chi square = 3.325, p < .07). In these supplementary analyses, the interactions also failed to approach significance.

The presence of these distortions readily distinguishes children with reading disabilities from normal children. However, when emotionally disturbed, normal readers are compared with reading retarded children, the distortions do not distinguish so efficiently.
The difference here, just below significance, 
suggests that though reading retardation may 
well reflect immaturity in perceptual-motor 
development, this developmental hypothesis 
cannot be used to account in full for the re- 
tardation. Emotionally disturbed, normal read- 
ers offered these distortions too frequently to 
permit their exclusive association to reading 
retardation. 
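
The conversion from transmitted information to chi square described in the analysis section can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's computation: the constant 1.3863 is taken to be 2 ln 2; N is assumed to be the number of scored observations (children times the five distortion types, i.e., 600 for all three groups and 400 for a pair of groups); and the total df of 29 is inferred from the 2 × 3 × 5 design with a two-valued dependent variable. The function names are illustrative.

```python
import math

def transmitted_information(table):
    """T(x; y) in bits for a two-way table of counts (rows: x categories, columns: y categories)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    t = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                t += (count / n) * math.log2(count * n / (row_tot[i] * col_tot[j]))
    return t

def corrected_chi_square(t_bits, n, df):
    """Chi square = 2 ln(2) * N * T (2 ln 2 is about 1.3863), minus df as the bias correction."""
    return 2.0 * math.log(2.0) * n * t_bits - df

# Reported T' values; N and df below are assumptions, not figures stated in the source.
print(round(corrected_chi_square(0.16251, 600, 29), 2))  # total, all sources
print(round(corrected_chi_square(0.01239, 400, 1), 2))   # Group R versus Group N
print(round(corrected_chi_square(0.00780, 400, 1), 2))   # Group R versus Group E
```

Under these assumptions the three results agree with the corrected chi squares given in the text (106.173, 5.870, and 3.325) to within rounding, which lends some support to the assumed values of N and df.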

The findings of the present study lend sup- 
port to the importance of the developmental 
concept in understanding reading disability. 
The question of the etiology of this develop- 
mental lag should still be raised. The present 
study can offer no evidence on this point. 


SUMMARY AND CONCLUSIONS 


The relationship between perceptual-motor 
development and reading disability was in- 
vestigated. The Bender gestalt test was ad- 
ministered to children retarded in reading 
ability, children defined as emotionally dis- 
turbed but normal readers, and normal chil- 
dren. These groups were matched on a num- 
ber of pertinent variables. 

It was hypothesized that the following
Bender gestalt test signs would discriminate
among the three diagnostic groups and be-
tween two age levels within these groups: (a)
difficulty in constructing angles, (b) rotation
of figures, (c) primitivation of figures,
(d) separation of adjacent parts of figures, 
(e) inability to maintain slant of figure. 

The presence or absence of these distor- 
tions, the dependent variable in a four dimen- 
sional information transmission analysis, was 
affected by the type of distortion and by the 
age variable. The younger Ss produced more 
distortions than the older group. 

A suggestive finding regarding the fre- 
quency with which the distortions occurred in 
the three diagnostic groups was explored. The 
distortions were indeed offered more fre- 
quently by the reading disability children 
than by the normal children. When the read- 
ing retarded and the emotionally disturbed 
children were compared, the significance of 
the difference fell just below an acceptable 
level. 

The developmental hypothesis has received 
some support from these findings though it 


cannot claim to explain fully the presence of 
reading disorders. No etiological implications 
can be drawn. 
AN EXPLORATION OF THE RELATIONSHIP BETWEEN
HYPNOTIZABILITY AND ANXIETY
AND/OR NEUROTICISM¹


FRED HEILIZER²


University of Rochester 


Studies by Eysenck (1943b, 1944, 1947) 
and Himmelweit, Desai, and Petrie (1946) 
indicate the existence of a positive correla- 
tion between neuroticism and hypnotizability. 
In the most comprehensive of these studies, 
Eysenck (1947) found a perfect positive rank 
order correlation between neuroticism and 
postural sway for groups of neurotics and 
normals. In the same publication, Eysenck 
(1947), on the basis of a 47-item question- 
naire, demonstrated that “suggestible neu- 
rotics and nonsuggestible neurotics differ with 
precisely the same factor [neuroticism] as do 
neurotics as a whole and normals” (p. 272). 
In another study, Eysenck (1944) reports 
that the neurotics swayed significantly more 
than the normals, and concludes “that there 
is a close relation between primary suggesti- 
bility and neurosis” (p. 410). In a third 
study, Eysenck (1943b) found that neurotics 
were most responsive on the postural sway test
when they were first admitted to the hospital 
and that their postural sway scores decreased 
during a period of four weeks at the hospital. 
He interprets this finding as supporting his 
hypothesis that neuroticism and hypnotiza- 
bility are positively related by assuming that 
hospitalization resulted in a psychological im- 
provement in the patients. Himmelweit et al. 
(1946) found a tetrachoric correlation of .51, 
based upon a dichotomy between neurotic and 
normal subjects (Ss), between postural sway 
and psychiatric (neurotic) diagnosis. 

1 This paper was abstracted from a dissertation
submitted in partial fulfillment of the PhD degree.
I would like to express my gratitude to G. R. Wendt
and E. L. Cowen for their interest and assistance in
this study.

2 Now at the VA Hospital, Northampton, Mass.

In a second group of studies no relationship
between neuroticism and hypnotizability
was found. Davis and Husband (1931), 
Bartlett (1936b), and Messer, Hinckley, and 
Mosier (1938) report no relationship be- 
tween neuroticism, as defined by the Thur- 
stone or Bernreuter Personality Schedule, and 
hypnotizability, as defined by a depth of 
hypnosis scale or by postural sway. Bartlett 
(1936a), using hospitalized neurotics and nor- 
mals, found no difference between the two 
groups on the postural sway test. On the ba- 
sis of data collected from hospitalized neurotic 
soldiers, Eysenck (1943a), in an early study, 
concludes that there is “good agreement with 
the findings of Bartlett, who found no differ- 
ences between neurotics and normals in the 
Body Sway Test” (p. 30). Ingham (1954)
reports that two of Eysenck’s students found 
that neurotics and normals did not differ sig- 
nificantly on the postural sway test. 

Despite the general superiority of the first 
group of studies, in terms of the N utilized
and the specification of the neuroticism di- 
mension, there are two sets of findings which 
indicate that the conclusion drawn by Ey- 
senck and Himmelweit et al. (that there is 
a general positive relationship between neu- 
roticism and hypnotizability) may be prema- 
ture. Benton and Bandura (1953) report a 
nonsignificant correlation (r = .19) between 
static ataxia (a neuroticism indicator) and
postural sway when utilizing normal Ss. This
is consistent with a rarely quoted datum (r =
.06 between static ataxia and postural sway
with normal Ss) reported by Eysenck (1947,
p. 277). The second set of findings is provided
by Ingham (1954, 1955) who reports
(a) that neurotics demonstrate significantly
more postural sway and arm movement suggestibility
than do normals, but (b) that
there is no evidence of such a difference if
baseline activity (without direct repetitive 
verbal suggestions) is controlled. Further, 
Ingham (1955) reports that neurotic Ss who 
had taken sedatives the night before scored 
significantly higher on the arm movement test 
than did neurotic Ss who had not taken seda- 
tives. However, it is not clear on the basis of 
these data whether the hypnotizability scores 
are related to neuroticism per se, to the taking 
of sedatives per se, or to a combination of
these factors.


PROBLEM 


The purpose of this study was to investi- 
gate the relationship between neuroticism and 
hypnotizability among normals, utilizing (a) 
an adequate sample, (b) more precise measurement
of the postural sway and static ataxia
variables than is usually made, (c) the heat 
illusion test of hypnotizability in addition to 
postural sway, and (d) a sampling of meas- 
ures of neuroticism and anxiety. The neuroti- 
cism and anxiety concepts were used inter- 
changeably on the theoretical grounds that 
the two concepts are closely related (Fenichel,
1945; May, 1950; White, 1948). A supporting
datum for such usage has been gathered
by Franks (1954) who found a correlation
of .92 between the Maudsley Medical Questionnaire,
a widely used measure of neuroticism,
and the Taylor anxiety scale. Since
there is no exact counterpart of a psychiatric
interview with normal college students and
since the problem of selectively validating one
or a few measures of anxiety or neuroticism
has not yet been satisfactorily resolved, a
variety of measures were used. Although such
a procedure may not be conducive to an immediate
decision to accept or reject the null
hypothesis, it would seem to be more conducive
to a fruitful line of investigation in
these circumstances.

There are two methodological usages in the 
studies outlined above which are central to 
this study. (a) Hypnotizability has been al- 
most invariably defined in terms of the pos- 
tural sway test. In only one study (Davis & 
Husband, 1931) was hypnotizability defined 
in terms of a depth of hypnosis scale. (b)
The studies which indicate the existence of a
positive correlation between neuroticism and
hypnotizability defined neuroticism in terms
of either static ataxia or the Maudsley Medi- 
cal Questionnaire. In evaluating the relation- 
ship between neuroticism and hypnotizability 
among normal Ss, then, the most direct com- 
parability involves postural sway as the inde- 
pendent variable and static ataxia and the 
Taylor anxiety scale (see above: Franks, 
1954) as the dependent variables. The use of 
the heat illusion test of hypnotizability and 
the additional measures of anxiety and neu- 
roticism constitute an attempt to extend the 
generality of this investigation. 


INSTRUMENTS 


Hypnotizability, the degree or depth to 
which a person can be hypnotized, was de- 
fined primarily in terms of reaction to the 
postural sway test (Furneaux, 1952; Heilizer, 
1959) and secondarily in terms of reaction to 
the heat illusion test (Eysenck & Furneaux, 
1945; Furneaux, 1946; Furneaux, 1952). The 
postural sway test was introduced to S as a 
test of motor activity. A control period and 
an experimental period of 2½ min. each were
utilized. S was instructed to relax and stand 
still during the control period. The experi- 
mental period consisted of direct repetitive 
suggestions to S that she was falling forward.
Measurements of deviations from the
baseline were made at 5-sec. intervals. The
30 measurements for each period were averaged
and the difference between the two averages
constituted the postural sway score.
The heat illusion test was introduced as a
measure of sensory threshold. S was shown
how a small heating element became hot as a
pointer was moved along a scale from 0 to
100. The heating element was then placed
against S’s forehead and she moved the
pointer along the scale until she reported
feeling the heat. A second trial was run with
the heating element disconnected. If S reported
feeling the heat on this trial, she was
given a positive (hypnotizable) score.

The dependent measures were derived from
five instruments or tasks: the control period
of the postural sway task, the Biographical
Inventory, the Bills-Vance-McLean Index of
Adjustment and Values (the Bills), a paper-and-pencil
mirror tracing task, and stories
written in response to TAT cards.

The static ataxia measure (Eysenck, 1947;
Himmelweit, et al., 1946) was derived as the 
standard deviation of the 30 measurements 
taken during the control period of the pos- 
tural sway task. The A Scale (Taylor, 1953) 
and the Lie Scale (Welsh & Dahlstrom, 1956) 
were utilized from the Biographical Inven- 
tory. The self-concept—ideal-self discrepancy 

(Column IV) was used from the Bills (Bills, 
Vance, & McLean, 1951). Time (in seconds) 
and error scores were derived from a mirror 
tracing task which utilized a six-pointed star 
(Waters & Sheppard, 1952). The error scores 
were recorded in terms of 0.1” unit excursion 
amplitude by summing the largest single in- 
side and outside excursion for each of the 
12 straight-line segments to produce a total 
error score. The Discomfort Relief Quotient 
(DRQ) (Dollard & Mowrer, 1947) was 
scored in phrase units from stories written 
in response to three TAT cards (7BM, 12M, 
6BM). In addition to the DRQ, an Emotionality
scale, a Distance scale, three Constriction
scales, and a Time/Phrase scale
were utilized. The Emotionality scale was derived
as the ratio of the number of phrases
expressing emotion (relief or discomfort) to
the total number of phrases. The Distance
scale was based on the assumption that a person
who typically reacts to anxiety-producing
situations by withdrawing will, when writing
a story in response to a TAT card, refrain
from projecting himself into the story; he
will write a story qualified by statements
which detract from the projective quality.
Such statements were classified under four
headings:

1. Personal references (e.g., “I would say
. . .;” “I think . . .;” “I believe . . .;” “As
you can see . . .;”)

2. Probability statements in which the outcome
is undecided or doubtful (e.g., “Either
. . . or . . .;” “Probably . . .;” “Perhaps
. . .;” “Seems . . .;” “Appears to be . . .;”
“Assuming . . .;”)

3. Concrete statements (e.g., “The characters
are . . .;” “The outcome is . . .;” “This
is a picture of . . .;” “In the first slide
. . .;” “In this he says . . .;”)

4. Vague statements, referring to a vague
“something” or to indefinite ages (e.g., “Something
is bothering him . . .;” “He is thinking
of something in his past . . .;” “He is 25
or 26 years old . . .;” “She is between 20
and 25 . . .;”).

The score was derived as the ratio of the
number of statements reflecting distance to
the total number of phrases. Constriction
scales were derived from the DRQ, Emotionality,
and Distance scales by dividing the
scores on the respective scales by the total
number of phrases, thus giving added weight
to the amount produced by reducing the
scores according to the total number of
phrases. This procedure was based on the assumption
that one of the usual reactions of
anxious Ss, when placed in a fairly unstructured
situation, is to withdraw by producing
as little as possible, that is, constriction.
The Time/Phrase scale, utilized only for the
second TAT testing (see below), was derived
as a rate-of-production indicator: the ratio of
the amount of time, in seconds, to the total
number of phrases. Experimenter’s (E’s)
scoring reliability, based upon the same phrase
units for each scoring of 50 randomly selected
stories, was .95, .85, and .98 for the
DRQ, Emotionality, and Distance scales, respectively.
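
The derived scores described in this section reduce to simple arithmetic, which the following Python sketch restates. It is an illustration only: the function names are hypothetical, the sign convention for the sway score (experimental minus control) and the use of the population standard deviation for static ataxia are assumptions not stated in the text, and all sample values are invented.

```python
from statistics import mean, pstdev


def postural_sway_score(control, experimental):
    """Difference between the two period averages.

    Assumption: experimental average minus control average.
    """
    return mean(experimental) - mean(control)


def static_ataxia(control):
    """Standard deviation of the control-period measurements.

    Assumption: population SD; the text does not say which form was used.
    """
    return pstdev(control)


def ratio_scale(count, total_phrases):
    """Emotionality or Distance: scored statements over total phrases."""
    return count / total_phrases


def constriction(scale_score, total_phrases):
    """Constriction: a scale score divided again by the total phrases."""
    return scale_score / total_phrases


def time_per_phrase(seconds, total_phrases):
    """Time/Phrase: writing time in seconds over total phrases."""
    return seconds / total_phrases


# Invented data: 30 readings per period, one every 5 sec.
control = [2.0, 4.0] * 15
experimental = [5.0, 7.0] * 15
sway = postural_sway_score(control, experimental)
ataxia = static_ataxia(control)

# Invented story: 40 phrases, 10 emotional, 6 distancing, 300 sec.
emotionality = ratio_scale(10, 40)
distance = ratio_scale(6, 40)
emotionality_constriction = constriction(emotionality, 40)
rate = time_per_phrase(300, 40)
```

Because a Constriction score divides by the phrase count a second time, a short story lowers the denominator twice, which is how brevity receives the added weight the text describes.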


SUBJECTS AND PROCEDURES 


The Biographical Inventory was distributed
to a class of 135 female introductory psychology
students by the instructor with a request
to complete it at home for purposes of standardization.
Of the 135 students, 128 completed
the Biographical Inventory, 75 volunteered
for this experiment, and 62 completed
the experimental situation. (Fifty-nine of the
62 Ss completing the study also completed
the Biographical Inventory.) Comparisons
were made between the A Scale and Lie Scale
scores of volunteers and nonvolunteers. The
two groups were not significantly different
(A Scale: 26, F = 1.38; Lie Scale:
t = 0.24, F = 1.06). Thus, it appears that
the Ss utilized in this study were a representative
sample of the class of 135 students.

The Bills and the three TAT cards were
presented in group sessions. A 6-min. time
limit was allowed for each TAT card. The
following tasks were then presented in individual
sessions: postural sway, heat illusion,
the three TAT cards, and mirror tracing. The
TAT cards were presented with no time limit
to 49 randomly selected Ss. Originally, we 
had expected Card 12M to elicit many stories 
concerning hypnosis which could be analyzed 
as to content and outcome. However, only 15 
stories about hypnosis were elicited during 
the group testing and many of these were 
only incidentally concerned with hypnosis. 
Therefore we repeated the TAT presenta- 
tions with the incidental remark preceding 
presentation of Card 12M: “This is an inter- 
esting one. Did you know that almost every- 
one makes up a story about hypnotism for 
this card?” This resulted in 11 hypnosis 
stories, several of which were only inciden- 
tally concerned with hypnosis. 


RESULTS AND DISCUSSION 


The relationship between postural sway and 
each of the dependent variables was evalu- 
ated by Pearson and curvilinear correlations. 
(The curvilinear correlations were computed 
with postural sway as the independent variable.)
Fisher z scores and phi coefficients
were computed utilizing a 65th percentile
division of the postural sway distribution
(Heilizer, 1959). Six of the 76 tests of significance
reached the .05 level: a Pearson correlation
of .29 for Emotionality (second TAT
testing); curvilinear correlations of .77 and
.53 for mirror tracing time and error scores,
respectively (in which the deviations from a
zero, linear slope were confined to the high
postural sway group, above the 65th percentile,
and were in the shape of an inverted
U); and phi coefficients for DRQ/Constriction
(-.38), Emotionality/Constriction
(-.30), and Time/Phrase (-.30), all
of the second TAT testing.

Neither static ataxia nor the Taylor anxiety
scale yielded a significant relationship
with postural sway despite the use of a more
precise measurement of the postural sway
and static ataxia variables than had previously
been made in the pertinent literature.
We conclude, then, that the positive correlation
between neuroticism and hypnotizability
reported by Eysenck and Himmelweit et al.
does not occur among normal Ss when hypnotizability
and neuroticism are defined in
terms of similar instruments.

If the remainder of the dependent variables
are divided into measures with currency
in the literature (Lie Scale, Bills, mirror 
tracing, DRQ) and measures which were de- 
rived for this study on rational grounds (Emo- 
tionality, Distance, the three Constriction 
scales, Time/Phrase) there are no distinctive 
trends in either group, with 2 out of 24 
evaluations reaching significance in the for- 
mer group and 4 out of 44 evaluations reach- 
ing significance in the latter group. 

The relationship between heat illusion (27 
positive and 35 negative responses) and each 
of the dependent variables was ascertained by 
the phi coefficient. Two of the 19 tests of significance
reached the .05 level: the phi coefficients
for Distance (.36) and Distance/Constriction
(.29), both of the first TAT
testing. 
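
For readers unfamiliar with the phi coefficient, the sketch below computes it from a fourfold table. It is illustrative only: the row totals (27 positive, 35 negative) come from the text, but the split of each row between the two columns is invented, and the text does not say how each dependent variable was dichotomized.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for the fourfold table [[a, b], [c, d]]."""
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return (a * d - b * c) / denominator


# Rows: heat illusion positive (27 Ss) and negative (35 Ss).
# Columns: high and low on some dichotomized dependent variable.
# The cell counts are hypothetical.
phi = phi_coefficient(18, 9, 12, 23)
```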

Thus, on the basis of the few significant 
relationships and the inconsistencies among 
the significant data, we conclude that the 
null hypothesis cannot be rejected for either 
the primary or secondary measures of hyp- 
notizability or neuroticism and/or anxiety. It 
is possible that one or more of the significant 
results may reflect a true relationship. How- 
ever, the assertion that one or more of the 
significant results does reflect a true relationship
would be acceptable only upon replication.
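
The caution about the few significant results can be made concrete with a rough chance calculation. The sketch below is not part of the original analysis. It treats the tests as independent, which they were not (the same 62 Ss and overlapping measures were used throughout), so the probabilities are only a loose guide.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """Binomial probability of k or more 'significant' results in n
    independent tests when every null hypothesis is true."""
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


# Postural sway: 6 of 76 tests reached the .05 level.
expected_sway = 76 * 0.05
p_sway = prob_at_least(6, 76)

# Heat illusion: 2 of 19 tests reached the .05 level.
expected_heat = 19 * 0.05
p_heat = prob_at_least(2, 19)

print(f"sway: expected {expected_sway:.2f}, P(6 or more) = {p_sway:.3f}")
print(f"heat: expected {expected_heat:.2f}, P(2 or more) = {p_heat:.3f}")
```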


It seems appropriate to conclude that the 
positive relationship between neuroticism and 
hypnotizability reported by Eysenck and Himmelweit
et al. does not apply to normal Ss.
At this point, Ingham’s data, indicating that 
hypnotizability may not be related to neu- 
roticism per se, assume added importance. It 
is possible that the relationship between neu- 
roticism and hypnotizability among neurotics 
is the result of a drug usage which does not 
occur to an appreciable extent among nor- 
mals. If this proves to be the case, then no 
simple relationship between neuroticism and 
hypnotizability can be stated, and the drug 
action might profitably become the focus of 
research interest. 

SUMMARY 

This study was designed to examine the relationship
between hypnotizability and neuroticism
among normal Ss, utilizing (a) an
adequate sample, (b) more precise measurement
of the postural sway and static ataxia
variables than is usually made, (c) the heat 
illusion test of hypnotizability in addition to 
postural sway, and (d) a sampling of meas- 
ures of neuroticism and anxiety. Sixty-two 
college female volunteers completed the pos- 
tural sway and heat illusion tests for hyp- 
notizability and a variety of tests of neu- 
roticism and anxiety. 

The null hypothesis could not be rejected, 
thus leading to the conclusion that the posi- 
tive relationship between neuroticism and hyp- 
notizability reported to occur with neurotic 
Ss does not occur with normal Ss. It was 
speculated that the relationship between neu- 
roticism and hypnotizability among neurotics 
is the result of a drug usage which does not 
occur to an appreciable extent among normals.


PERSONALITY CHARACTERISTICS ASSOCIATED WITH
RESISTANCE TO CHANGE 1


CARROLL E. IZARD


Vanderbilt University


1 This study . . . a Vanderbilt-ONR contract Nonr 2149 ; . . . Harvey,
Principal Investigator. The opinions and conclusions do
not necessarily reflect . . . of the Navy Department.
The author is indebted to Harvey for valuable
suggestions, . . . with regard to the pro . . .
effects of influence.


A number of studies have shown that people
misperceive, change their opinions, or
otherwise behave so as to indicate that they
are yielding or conforming in response to the
pressure of the group, to authority, or to some
other source of influence (Asch, 1952; Hovland,
Janis, & Kelley, 1953; Hovland &
Weiss, 1951; Sherif & Harvey, 1952). These
researches have demonstrated that a number
of social conditions will bring about such
change. On the average, experimental subjects
(Ss) exposed to one of these conditions
will yield, misperceive, or change their stand
on an issue significantly more than a control
group. Since the individual members of a
group, even an experimentally manipulated
group, differ in important ways, it is not surprising
that psychologists are beginning to
study the personality differences among people
who yield in varying degrees to outside
influence (Boomer, 1959; McDavid, 1959).
These studies have not yielded data on specific
or well delineated personality variables.
One reason for this was the lack of personality
measures adequate to the task.

Recent research has pointed to Edwards
Personal Preference Schedule (EPPS) as a
promising possibility. Bernardin and Jessor
(1957) utilizing the EPPS to measure personality
characteristics tested the hypothesis
that a group high on Autonomy and low on
Deference would be less dependent than a
group low on Autonomy and high on Deference.
The Ss in each group were exposed to

experimental situations designed to elicit be- 
havior relevant to three aspects of depend- 
ency: (a) reliance on others for approval, 
(b) reliance on others for help, and (c) conformity
as measured in an Asch-type group
situation. The high Autonomy, low Deference
group was significantly less dependent in
terms of the first two criteria but not on conformity.
Bernardin and Jessor thought that the
Autonomy-Deference scales failed to predict
conformity scores because of the difference
between the behavioral situations involved in
responding to the EPPS and the concrete reality
of the Asch-type group situation. Gisvold
(1958) attempted to check on this latter proposition.
He modified Asch’s method so that the pressure
to conform was artificially produced:
an S did not see or hear the actual responses
of the rest of the group but was under the
illusion that the three incorrect responses exhibited
by the experimenter (E) were the
opinions of the other three Ss in the experimental
situation. Under these conditions Autonomy
correlated significantly with conformity
but Deference did not. Correlations for
other EPPS scales and conformity were not
reported in either of the foregoing studies.

In the present study an hypothesis relating
to a third type of social influence was formulated
for Autonomy and Deference as well as
for Dominance and Abasement. Since the criterion
measure was in terms of resistance to
change in response to an authority figure’s attempt
to influence, it was hypothesized that
Autonomy and Dominance would correlate
positively with the criterion while Deference
and Abasement would correlate negatively.
The selection of these EPPS scales as predictors
of behavior change in this experimental
situation was largely on the basis of
the judged relevance of the item content of
these scales to the predicted behavior. Ac- 
cording to the item content of the scales: 
Autonomy relates to a person’s preference 
for behaving in terms of his own thoughts 
and feelings independently of the thoughts 
and feelings of others; Deference relates to 
a person’s preference for seeking the opinion 
and advice of others and for looking to others 
for decisions and leadership; Dominance re- 
lates to a person’s preference for a super- 
ordinate role, for arguing for his own point of 
view; Abasement relates to a person’s pref- 
erence for giving in rather than fighting to 
have his own way and to a tendency to 
feel inferior to other people. The reports of 
Bernardin and Jessor and Gisvold, published 
subsequent to the completion of the present 
experiment, lend some support to the hy- 
pothesis. However, the former authors did 
not report the separate correlations of Au- 
tonomy and Deference with their criteria of 
dependency or conformity, and as already indicated
the latter found no significant relation
between Deference and conformity. Results
favoring the present hypothesis will
throw further light on the validity of the 
selected EPPS scales for predicting Ss’ re- 
sponses to interpersonal influence and will 
help identify the personality characteristics 
associated with tendency to change. 


PROCEDURE 


The 39 Ss were all the members of a class in psychology.
They were given the EPPS during a class
hour early in the semester. Following this, each S 
was asked to meet with E individually to participate 
in an experiment in judgment. On arrival at the 
laboratory, S was told that he would be taken into 
a dark room where he would estimate the distance 
between two points of light. S was seated in the 
dark room and given pencil and pad with which he 
was to record his estimates in inches. The stimulus 
lights were exhibited for .3 sec. with an interval of 
4 sec. between them and with an interval of 5 sec.
between each pair. They were approximately eye 
level, 15 ft. ahead of S. The horizontal distance be- 
tween the points of light was fixed at 24 in. through- 
out the experiment. Each S made 120 judgments in 
all, with approximately 1 min. break between each 
set of 30. At the end of the first set, S was told: 

You have just seen the standard series of lights
which varied from ____ (S’s minimum estimate)
to ____ (S’s maximum estimate) inches apart. I
shall continue to show you lights within this range,
except that occasionally I'll show a pair of lights
definitely farther apart than any in the standard
series. For the rest of the experiment, you should
say “standard” if the pair of lights is within the
range ____ to ____ which you just saw. You
should say “longer” if the pair of flashes are distinctly
farther apart than any in the standard
series.

S was then shown 90 more pairs of lights, each at
the fixed 24 in. apart.

It was considered that in the process of making
the first 30 estimates S formed a concept of the distance
or range of distances between pairs of lights
in the series. Thus, for each subsequent set of
judgments, the score for resistance to change of concept
was taken as the largest number of consecutive
“standard” responses. The final index of resistance to
change of concept for a given S was obtained by
summing the three resistance scores from the three
sets of 30 judgments.
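
The scoring rule just described can be restated as a short Python sketch. The function names are hypothetical, and the response sets are invented and shortened for readability; in the experiment each of the three sets contained 30 judgments.

```python
def longest_standard_run(responses):
    """Largest number of consecutive "standard" responses in one set."""
    best = 0
    current = 0
    for response in responses:
        if response == "standard":
            current += 1
            best = max(best, current)
        else:
            # Any "longer" response breaks the run.
            current = 0
    return best


def resistance_index(sets_of_responses):
    """Sum of the per-set resistance scores for one S."""
    return sum(longest_standard_run(s) for s in sets_of_responses)


# Invented, shortened response sets for one S.
set_2 = ["standard"] * 4 + ["longer"] + ["standard"] * 2
set_3 = ["longer", "standard", "standard", "standard", "longer"]
set_4 = ["longer"] * 5
index = resistance_index([set_2, set_3, set_4])
```

Under this rule an S who never says "longer" in a set of 30 scores 30 for that set, so the index ranges from 0 to 90.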


RESULTS 


Table 1 shows the correlations of the 
EPPS scales with the criterion of resistance 
to change. The correlations for men and 
women are reported separately, since the 
sexes have different means on the EPPS vari- 
ables under consideration. Since a directional 
hypothesis was made, the one-tailed test was 
applied. 

For the men, the four correlations were all in
the predicted direction. The correlations of
Autonomy and Dominance with the criterion
were significant at the .05 level of confidence
and that for Abasement closely approached
this level. The correlation of Deference and
the criterion was in the predicted direction
but failed to reach significance. For the
women, none of the four correlations were
significant at the 5% level. The correlation
for Deference and the criterion was in the
right direction and approached the magnitude
of the correlations for the men.


TABLE 1

CORRELATIONS BETWEEN THE PERSONALITY
CHARACTERISTICS AND THE CRITERION OF
RESISTANCE TO CHANGE

[Table body not recoverable; surviving row labels: Dominance, Deference, Abasement]

For both sexes, correlations were computed 
for the 11 EPPS variables for which no hy- 
potheses were made. A two-tailed p of .01 
was required for these correlations, in view 
of the number computed and the lack of di- 
rectional hypotheses. On this basis, only the 
correlation for men of -.53 between Order
and the criterion approached significance.
Only two other nonsignificant correlations
were of the order of magnitude reached by
the significant hypothesized correlations: for
Succorance r was .49 for women and for Endurance
r was .35 for men.


DISCUSSION


It is difficult to compare these results with 
those of the two previous studies relating 
personality (EPPS) variables and change. 
Neither of them reports separate correlations
for men and women. Both studies utilized
only the Autonomy and Deference scales and
one (Bernardin & Jessor, 1957) used the two
scales as a multiple criterion for grouping
without giving any indication of the contribu- 
tion of the separate scales to the prediction 
of hypothesized behavior. All three studies 
used different methods of exerting social influ- 
ence or pressure to change. 

If both Autonomy and Deference con- 
tributed to the differentiation of dependent 
and independent Ss in the Bernardin and 
Jessor study and if susceptibility to change 
is indeed a part of the dependency construct, 
then Autonomy is the one EPPS variable that 
had a measure of validity in all three studies. 
Bernardin and Jessor did not get a difference
between their dependent and independent
groups in an Asch-type situation, but Gisvold
did when he modified the Asch technique so
that Ss could conform without publicly announcing
a response contrary to a concrete
reality situation. In the present study Ss could
likewise conform or be influenced in privacy.

The difference between the relationships of
EPPS variables and the criterion for men and
women could mean that the personality variables
lead to different responses in the two
sexes, that the experimental situation had a
different effect, or that the interaction of personality
and situational variables was different
for males and females. It is entirely possible
that judging the distance between lights
in a dark room could have quite different 
meaning and different degrees of relevance for 
the sexes. 

Although none of the correlations for the 
EPPS variables judged irrelevant to the cri- 
terion actually reached significance at the re- 
quired level, two of them were between the 
.05 and .01 level and were equal to or larger 
than the hypothesized correlations. Further, 
the correlations for these variables were con- 
sistent in direction for both sexes. These vari- 
ables and the correlations with the criterion 
for males and females, respectively, were: Or- 
der, —.53, —.28; and Succorance, .19, .49. It 
is conceivable that Ss characterized by high 
Order may have felt considerably disoriented 
and insecure in the relatively unstructured, 
ambiguous experimental situation. If so, it is 
reasonable that they should be more suscep- 
tible to outside influence—instructions which 
gave additional structure and order to the 
situation. A possible, though highly speculative,
explanation for the correlations of Succorance
with the criterion is that Ss high on
Succorance perceive attempts to change or influence
them as a threat to their need for acceptance,
affection, and encouragement.

The interpretation of the four variables in- 
volved in the hypothesis of this study is sim- 
pler, at least for male Ss. Autonomy and 
Dominance have a degree of validity for pre- 
dicting resistance to interpersonal influence 
as measured in the experimental situation. It 
can be said that each of these scales measures, 
in part, a facet of behavior on the autonomy-heteronomy
dimension. The two previous
studies support this generalization with respect
to Autonomy. One of these studies, and
the present one to a lesser degree, found that
Deference also measured an aspect of behavior
on this dimension for both men and
women.

The fact that the male Ss varied in the extent
to which they changed their perceptual-cognitive
response to a stimulus situation under
influence of E’s instruction and the fact
that this change tended to be associated with
certain operationally defined personality characteristics
have implications for various interpersonal
situations. Such situations include
some approaches to psychotherapy and coun- 
seling, where the advice, information, sugges- 
tion, or interpretation of one person is in- 
tended to influence another. Males low on 
Autonomy and Dominance and high on Abase- 
ment might be expected to change to a greater 
extent. There was some evidence that high 
Deference would contribute to susceptibility 
to change in both males and females. 


SUMMARY 


It was hypothesized that the personality 
characteristics of Autonomy, Dominance, Deference,
and Abasement, as measured by the
EPPS, would correlate with change of con- 
cept under influence of E’s instructions. The 
39 Ss, all the students in one undergraduate 
class in psychology, were first given the EPPS. 
Ss were then taken individually to the labo- 
ratory where they were given the opportunity 
to form a concept of the distance between 
two stimulus lights exposed in a dark room. 
Following this, E gave instructions calculated 
to influence S to change his concept of the distance, although the
actual distance remained the same.

For male Ss the hypothesis was confirmed 
for two of the four personality characteristics 
at less than the .05 level of confidence and 
for another at approximately the .05 level. 
Autonomy and Dominance correlated posi- 
tively with resistance to change and Abase- 
ment correlated negatively. The correlation 
of Deference and resistance to change was 
not significant but was in the predicted direc- 
tion for both men and women. The implica- 
tions of this finding for the validity of the 
EPPS and for certain interpersonal interac- 
tions, including psychotherapy and counsel- 
ing, were discussed briefly. 

The hypothesis was not confirmed for female Ss. It was considered a
possibility that the differences in correlations for men and women
were artifacts relating to the experimental situation.

None of the 22 correlations for the EPPS 
variables not included in the hypothesis was 
significant at the required .01 level. However, 
two of the variables had correlations which 
were between the .05 and .01 levels and which 
were corroborated by correlations in the same 
direction for the other sex. Tentative inter- 
pretations of the meaning of these variables 
in relation to the experimentally evoked behavior were offered.


CONSTRUCT VALIDITY OF THREE MASCULINITY-FEMININITY TESTS


Masculinity-Femininity (M-F) tests are usually developed using items
which differentiate men from women. Obviously, the construct validity
of these tests cannot rest entirely on this differentiation since
much simpler criteria are available for this purpose. The construct
validity
must also be based on [...] other variables. There are several a
priori reasons [...] relate th[...] quantitative [...] in men might
correlate with these [...].

On the basis of [...] might [...] related to marital [...] assumption
that a [...] marriage and parenthood might be expected [...] at a
younger age [...] higher masculinity scores than men who marry at an
older age, [...] who have more children might be [...] have higher
masculinity scores [...] children.
Psychoanalytic theory suggests that men who marry women older than
themselves are looking for a “mother-substitute.” Such men might be
expected to be lower in masculinity than men who marry women younger
than themselves, since the latter supposedly have resolved their
“Oedipus Complex” through identification with their father.

Jung (1939) has suggested that the feminine component in men
(“anima”) becomes stronger as they age. This hypothesis is supported
by data of Terman and Miles (1936) who found a decline in masculinity
scores after high school. One might expect somewhat similar results
on the M-F tests used in this study.

Most of the aforementioned hypotheses are highly speculative and
negative evidence would not be crucial for either the theories or the
validity of the tests. Positive evidence, however, would support the
theories and test validity, particularly if more than one M-F test
yields results in the expected direction.

SUBJECTS 


The subjects (Ss) in this study were 2,296 males employed at
“white-collar” jobs in a large Canadian business institution who were
administered a battery of tests in connection with a program of
personnel assessment. The group was above the general Canadian
population in educational achievement: 1% had completed elementary
school; 38% had one to three years of high school; 39% had four years
of high school; 19% had five years of high school; and 3% had one to
four years of college. The age of the group ranged from 17 to 56 (see
Table 4 for the distribution).


TESTS 


The tests administered to the group included intelligence, aptitude,
achievement, and personality tests. The analyses in this study will
concern only the following tests:

1. Masculinity-Femininity tests. (a) Guilford-Zimmerman Temperament
Survey (G-Z) M-F scale, (b) Minnesota Multiphasic Personality
Inventory (MMPI) M-F scale, (c) the Strong Vocational M-F scale.

2. Intellectual Abilities. (a) Otis Quick-Scoring Mental Ability
Test, (b) Watson-Glaser Critical Thinking Appraisal, (c) Thurstone
Mental Alertness Test, (d) Cooperative English Reading Comprehension
Test, (e) Cardall Arithmetical Reasoning Test.

3. Vocational Interest. The Kuder Preference Record—Vocational
Form C.


MARITAL STATUS VARIABLES 


The following marital status variables were analyzed in relation to
M-F tests: (a) single vs. married status (excluding widowed and
divorced Ss); (b) number of children; (c) wife’s age in relation to
husband’s (older, younger, or same); (d) age of S at marriage.

One further variable analyzed in relation to M-F was the S’s age.


RESULTS 
Correlations between Measures of M-F


The correlations between the measures of M-F are contained in
Table 1. In this table and all others giving correlations with the
MMPI M-F test, the correlations involving the MMPI M-F are reversed
in sign because this test is scored in the feminine direction while
the G-Z and Strong M-F tests are scored in the masculine direction.
All three correlations are significant and of about the same
magnitude (.3).


TABLE 1

INTERCORRELATIONS AMONG GUILFORD-ZIMMERMAN, MMPI, AND STRONG
MASCULINITY-FEMININITY SCALES

N = 2296

Correlations between the G-Z and Strong M-F scales and the other
scales contained in these tests were available. These correlations
are of some interest since they enable one to evaluate the validity
of the correlation between M-F scales using the multitrait-multimethod
system suggested by Campbell and Fiske (1959). In this case, however,
there is only one trait being evaluated, M-F. These authors list
several requirements for validity:

1. The validity correlation between two variables using different
techniques to measure the same trait should be significantly
different from zero and large enough to encourage further examination
of validity. The correlation of .34 between the two M-F scales was
highly significant but its magnitude is questionable. Considering the
large N involved in the correlation there is some justification in
proceeding to the next criterion.

2. The validity correlation should be higher than the correlations
between the variables and other variables that have neither construct
nor method in common. The .34 correlation between the M-F scales is
higher than the correlations of the Strong M-F with 9 other G-Z
scales and higher than the correlations of the G-Z M-F with 10 other
Strong scales. The data satisfied the second criterion.

3. The validity correlation should be higher than correlations
between the variables and other variables which are designed to get
at different traits but happen to use the same method. The .34
correlation between M-F scales is lower than one of the nine
correlations between G-Z M-F and other G-Z scales. G-Z M-F correlated
.35 with the Objectivity scale. The validity correlation is lower
than 2 of the 10 correlations between Strong M-F and the other Strong
scales. Strong M-F correlated .55 with the Production Manager scale
and .43 with the Certified Public Accountant, Senior scale. The data
do not satisfy the last criterion although the number of correlations
indicating the greater influence of method than construct is not
excessive. Furthermore, it should be noted that the scales on the
Strong were developed from the same pool of items so that
correlations between scales may be a function of item overlap.
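
The second and third requirements reduce to a count: how many comparison correlations exceed the validity correlation of .34. The following Python sketch illustrates that count. The helper name `count_exceeding` is hypothetical, magnitudes are compared on the assumption that sign is irrelevant here, and only the correlations quoted above (.35, .55, .43) are entered, since the remaining values are not reported in the text.

```python
def count_exceeding(validity_r, comparison_rs):
    """Count comparison correlations whose magnitude exceeds the validity correlation."""
    return sum(1 for r in comparison_rs if abs(r) > abs(validity_r))


VALIDITY_R = 0.34  # G-Z M-F with Strong M-F, as reported in the text

# Only the same-method correlations quoted in the text are listed here;
# the other correlations are not reported and are therefore left out.
gz_same_method = [0.35]            # G-Z M-F with Objectivity
strong_same_method = [0.55, 0.43]  # Production Manager; Certified Public Accountant, Senior

gz_violations = count_exceeding(VALIDITY_R, gz_same_method)
strong_violations = count_exceeding(VALIDITY_R, strong_same_method)
print(gz_violations, strong_violations)
```

With only the quoted values entered, the counts correspond to the one G-Z exception and the two Strong exceptions noted under the third requirement.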
TABLE 2

CORRELATION BETWEEN MEASURES OF MENTAL ABILITY AND MEASURES OF
MASCULINITY

N = 2296

G-Z    Strong    MMPI


Correlations of M-F with Measures of Ability

The G-Z M-F correlates positively and significantly with all measures
of ability, verbal and quantitative (see Table 2). The Strong M-F
correlates positively and significantly (using the conservative .001
level) with the Otis, the Thurstone Quantitative subtest, and the
Cardall Arithmetical Reasoning Test. Unlike the previous two tests,
masculinity on the MMPI correlates negatively with all ability
measures and all the correlations are significant except the
correlation with the Thurstone Quantitative subtest.


Correlations of M—F with Vocational Interests 


Correlations of the three measures of M-F with the Kuder interest
scales can be seen in Table 3. The general pattern of these
correlations is nearly the same on all three measures. Mechanical,
Computational, and Scientific interests correlate positively with
masculinity; while Artistic, Literary, Musical, and Clerical
interests tend to correlate negatively with masculinity. Outdoor
interests correlate positively with masculinity on the G-Z and Strong
scales but negatively with masculinity on the MMPI.


TABLE 3

Kuder scales: Outdoor, Mechanical, Computational, Scientific,
Persuasive, Artistic, Literary, Musical, Social Service, Clerical


TABLE 4

AGE AND MASCULINITY-FEMININITY


Age and M-F


Ss were grouped in four age groups as listed in Table 4. The M-F
scores in these groups were compared using single-classification
analyses of variance. Significant F ratios between age groups were
found on the Strong and MMPI M-F scales. Looking at the group means
on the Strong, it is apparent that the two younger groups (ages
17-36) are higher in masculinity than the two older groups (ages
37-56). The relationship is less clear on the MMPI where femininity
scores increase from the youngest to the 37-46 age group and then
decrease in the 47-56 age group.
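
As an illustration of the single-classification analysis of variance used here, the sketch below computes an F ratio from group scores. The function name `one_way_f` and the scores are hypothetical, and the bounds of the two youngest groups (17-26 and 27-36) are assumed from the ranges given in the text; none of these numbers are the study's data.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a single-classification analysis of variance."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Between-groups sum of squares: group size times squared deviation of group mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical masculinity scores for four age groups (illustrative only).
age_groups = [
    [30, 32, 31, 33],  # assumed 17-26
    [29, 31, 30, 32],  # assumed 27-36
    [26, 27, 28, 27],  # 37-46
    [25, 27, 26, 26],  # 47-56
]
f_ratio, df_b, df_w = one_way_f(age_groups)
print(round(f_ratio, 2), df_b, df_w)
```

The F ratio would then be referred to an F table with the printed degrees of freedom to judge significance.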


TABLE 5

DIFFERENCES BETWEEN SINGLE AND MARRIED MEN ON MASCULINITY-FEMININITY


TABLE 6

AGE AT MARRIAGE AND MEASURES OF MASCULINITY-FEMININITY

Marital Status and M-F


The married group was significantly higher than the single group on
masculinity as measured by the G-Z, but they were higher on
femininity as measured by the MMPI (see Table 5).


Number of Children and M-F

Married Ss were divided into five groups: no children, one child, two
children, three children, and four or more children. Single
classification analyses of variance did not yield significant F
ratios on any of the three M-F scales.


Wife’s Age in Relation to Husband’s and M-F

A comparison of the three groups, wife older, wife younger, and wife
same age, did not yield significant F ratios on any of the three M-F
scales.


Age-at-marriage and M-F

The Ss were divided into four groups on the basis of their age at
time of marriage (see Table 6). The F ratios between groups were
significant (.05 level) for the Strong and MMPI M-F scales. On both
scales, the relationship is similar: going from the younger age at
marriage to the older age at marriage, one sees a decrease in mean
masculinity scores (Strong) and an increase in mean femininity scores
(MMPI).


DISCUSSION 


The correlations between the three tests which purport to measure the
trait masculinity-femininity are low, albeit highly significant.
Nance (1949), who intercorrelated the same tests using a sample of
male students in a college of education, found two of the
correlations to be somewhat higher (G-Z and MMPI, .43; G-Z and
Strong, .28; Strong and MMPI, .51). Considering the size of our
sample the reliability of the obtained correlations is high. A
question does arise about possible national, vocational, and
educational differences between populations. Considering only the
results obtained on the population sampled in this study, it is
apparent that although the three tests have some communality, the
major part of their variance is not accounted for by the common
factor. This is apparent in other aspects of the results, i.e., the
finding that the G-Z and MMPI M-F tests give opposite results in
correlations with ability measures and in comparisons of married and
single groups.
Application of the Campbell and Fiske criteria to the Strong and G-Z
tests indicated weak construct validity for M-F as measured by these
tests. Considering the correlations with vocational interests it is
apparent that all the tests relate in a small but significant degree
to masculine and feminine interest patterns. A glance at the M-F
tests reveals that a large proportion of the items are stated in
terms of vocational interests.

Masculinity on the G-Z is positively correlated with all measures of
ability. The relationships are more pronounced on quantitative than
on verbal tests. Masculinity on the Strong seems to be more highly
related to quantitative measures of ability in conformance with the
stated hypothesis. Abilities on the MMPI are generally negatively
correlated with masculinity. The results are less pronounced on the
two quantitative tests. The results on the MMPI can be compared with
those of Goodstein (1954) who found that male college students score
high on femininity relative to the general population. A number of
items on the MMPI M-F scale score cultural interests as feminine. An
interest in poetry or the theater is feminine while an interest in
sports or hunting is masculine; a wish to be a librarian or
journalist is feminine while a wish to be a soldier or contractor is
masculine. While an increased interest in certain activities and
vocations stemming from education or intelligence may indicate
greater femininity (in the stereotype of the general population) it
is doubtful whether this is a meaningful t[...] in the more educated.
It is


not clear why masculinity on the G-Z is positively related to
measures of ability other than quantitative ability. The G-Z often
pairs vocations or interests of equal intellectual level in the
items, e.g., “You would rather study mathematics and science than
literature and music, T or F.” Perhaps forcing a choice makes the
more intelligent males subordinate general cultural interests to
specific vocational interests which require high intelligence.

The only one of the marital status variables giving consistent
results on at least two tests was age-at-marriage. Those who married
when young had higher masculinity scores on the Strong and MMPI than
those who married when older. As part of the larger study a “success
index” was computed for every S in terms of average annual increments
in salary over the period of employment with correction for change in
the value of the dollar. The mean success index values for the four
groups were as follows: under 2[...]; [...]25, 19.7; [...]6-30,
13.[...]; over 31, 9.1. Differences between all groups were
statistically significant and indicated that those men who married
earlier in life had been more successful in this organization. In
view of the association of age-at-marriage with both the success
index and Strong


and MMPI M-F [...] some relationship between masculinity and success
[...]. These correlations are [...] .09, but highly significant
(p [...]). Evidence from other tests related to the success index
indicated that the active, [...] restrained men had grea[...]
aggressive, impulsive aspect of masculinity [...] masculine men [...]
to earlier marriage. These men may find it easier to approach and
establish [...] with members of the opposite sex.
The evidence on the relation between age and masculinity is
ambiguous. Although the Strong scale indicates a sharp drop in
masculinity scores after 36 years of age, the MMPI indicates a drop
in masculinity up to 46 and an increase in the oldest age group
(47-56).

It seems apparent to the investigators that masculinity-femininity is
not a clearly defined construct. The tests used contain a mixture of
interests and emotional attitudes. Intratest factor analyses might
help clarify the construct that the tests purport to measure.


SUMMARY 


The Guilford-Zimmerman, Strong, and MMPI Masculinity-Femininity tests
were given to 2,296 employees of a Canadian business institution
along with tests of mental abilities and vocational interests. The
relationships between these tests, and the relationships between
masculinity and certain marital status and age variables were
investigated. The correlations between the three M-F tests were low
but highly significant. Masculinity as measured by the G-Z and Strong
tended to correlate positively with abilities, particularly
quantitative ability. Masculinity on the MMPI correlated negatively
with the ability tests, particularly the nonquantitative ones. All
tests correlated significantly with interest scales. Mechanical,
Scientific, and Computational interests correlated positively with
masculinity while Artistic, Clerical, Musical, and Literary interests
tended to correlate negatively with masculinity. Married Ss scored
higher on masculinity on the G-Z, but single Ss were higher on the
MMPI. Number of children and the relation of the wife’s age to the
husband’s did not yield significant differences. Age-at-marriage
yielded significant differences between four groups on the Strong and
MMPI scales. Those married at younger ages tended to have higher
masculinity scores. S’s age yielded significant differences between
groups on the Strong and MMPI M-F scales with a tendency for
masculinity to drop with increasing age, but the trends were not
completely linear.


AGGRESSION AND THE PICTURE-FRUSTRATION STUDY¹


J. KASWAN


University of California, Los Angeles


M. WASMAN AND LAWRENCE ZELIC FREEDMAN


Yale University


The Rosenzweig Picture-Frustration Study 
(P-F) is widely used as a clinical and re- 
search measure of verbal aggression, although 
its validity has not been clearly established. 
Most validating studies have attempted to 
correlate a specific criterion of aggression 
with the P-F scoring term “extrapunitiveness” 
(E)—defined as the percentage of responses 
in which the individual turns his aggression 
outward, against the environment. For exam- 
ple, Fry (1949), Holzberg and Hahn (1952), 
Vane (1954), and Weinberg (1952) have all 
found that juvenile delinquents or adult 
criminal offenders fail to manifest more E on 
the P-F than nonantisocial subjects (Ss), and 
have therefore concluded that the P-F is not 
valid. That such studies may represent too 
narrow an approach to validation is indicated 
by the APA Committee on Test Standards 
(1954, pp. 14-15) who note that test validity 
in the case of most clinical instruments must be 
evaluated by integrating evidence from many 
different sources and that often no single cri- 
terion measure or composite criterion can be 
identified. Thus aggression may occur in 
many different degrees, in many different 
situations, and in many different forms. Con- 
versely, similar degrees of aggression could be 
expressed in different ways. It therefore may 
not be reasonable to expect that any particu- 
lar measure of aggression which is arbitrarily 
selected as the single and specific criterion out of the total
spectrum of aggressive behavior should relate to P-F performance.

¹ A version of this paper was presented at the Eastern Psychological
Association Meeting at Philadelphia, April 12, 1958.

This study was carried out as part of a research project on
nonconformist behavior conducted within the Department of Psychiatry,
Yale University, directed by Lawrence Z. Freedman, and supported by
the State of Connecticut and Foundations Fund for Research in
Psychiatry. The orientation of the paper does not necessarily reflect
the theoretical framework of the over-all study.

This paper represents an attempt to investigate the validity of the
P-F by relating the quantity and intensity of aggressive responses
obtained on it to aggression as described by various other
techniques. We assume that if the P-F does refer to aggression it
should relate to a substantial proportion of the other measures of
aggression.

A measure of the intensity of aggressive reaction to frustration was
designed in the belief that mild verbal aggression, such as “I don’t
agree,” might characterize a different reactive process to
frustration than that reflected by responses containing references to
violent, uncontrolled aggression, such as “I'll break your neck.” The
failure of the present P-F scoring system to take into account such
variations in intensity has been pointed out by Holzberg and Hahn
(1952). This additional measure is expected to broaden the range of
relationships between the P-F and other measures of aggression.

The investigation utilizes part of the data collected in an
interdisciplinary study of antisocial offenders classified as sexual,
aggressive, or acquisitive. Psychiatrists, psychologists,
sociologists and, to a limited degree, lawyers, cooperated in
obtaining data by the techniques appropriate to each discipline in
such a way as to allow comparable statistics and other analyses. A
great deal of information was thus compiled for each individual,
providing possibilities for describing and relating behavior in a
variety of contexts.


PROCEDURE 
Subjects 
The sample consisted of 121 male state prison inmates selected so as
to match the total prison population on eleven variables
1945) and 54), where t

It is recognized that this approach is quite subjective but the
authors are not aware of any other method which can successfully deal
with the Rorschach as a clinical instrument designed to tap largely
unconscious feelings.

Space does not permit the listing of the signs used as indicators of
all the variables, but those used in the evaluation of degree of
aggressivity may serve as an example: Frequency of E[...] hostility
signs; blood; splatter fluids; high visceral anatomy; fire; high FM
wit[...] (example: rate [...] if more than 20% FM
very critical of cards; rejection of Card IX [...]
not classified either “aggressive” or “murderer” even though he may
have been convicted for murder.
though he may have been convicted for murder 
Twenty-two variables comprising all the data bearing on aggressive
behavior of sufficient numerical range for statistical analysis were
abstracted from these techniques and related to P-F performance by
chi square (see Table 2). Distributions of E and IE scores were split
into high, moderate, and low groups. Statistical requirements of chi
square (Cochran, 1954) necessitated the collapsing of some
contingency tables and in all such cases the upper extremes (high
scores) were contrasted with the combined moderate and low scores.
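
The collapsing step and the chi square computation can be sketched as follows. The Yates correction is the one noted under Table 2 for two by two tables; the function names and the cell counts are hypothetical and serve only to show the arithmetic, not to reproduce any reported value.

```python
def yates_chi_square(a, b, c, d):
    """Chi square with Yates continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # The correction shrinks |ad - bc| by n/2, but never below zero.
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def collapse_to_high_vs_rest(high, moderate, low):
    """Collapse (high, moderate, low) rows into high vs. combined moderate and low."""
    return high, [m + l for m, l in zip(moderate, low)]


# Hypothetical counts for 121 Ss; columns are criterion present / absent.
high_row, rest_row = collapse_to_high_vs_rest([25, 15], [18, 24], [12, 27])
chi2 = yates_chi_square(high_row[0], high_row[1], rest_row[0], rest_row[1])
print(round(chi2, 2))
```

Contrasting the high group with the combined remainder leaves a single degree of freedom, which is the case in which the continuity correction applies.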

RESULTS 
Table 1 indicates that the mean extrapunitive, intropunitive, and
impunitive scores for our sample closely matched the revised norms
reported by Rosenzweig (1950). The scores for our sample, however,
appear to be somewhat more variable on all three measures than those
obtained from his normative sample. The mean for the intensity rating
of extrapunitiveness was 1.29 with a standard deviation of .32 for a
three point scale, indicating that most Ss tend to express a rather
mild intensity of aggression to this technique as measured by our
ratings.

The interrelationships of P-F measures were determined by
correlational and chi square analyses and are reported in Table 2. E
was found to be significantly correlated with the IE measure, so that
frequency and intensity of aggression appear positively associated. E
was inversely related to intropunitiveness and impunitiveness as a
result of the dependent scoring system. However, the association of
both high E and high IE with both high ego defensiveness and low need
persistence scores represents a relationship between independent
scoring categories. The latter measures are two of Rosenzweig’s three
“types of Aggressive Reaction.” According to Rosenzweig, Ss who
express intense and/or frequent outward aggression on the P-F tended
to respond in a manner “in which the ego of the subject predominated”
rather than in a manner in which “the solution of the frustrating
problem is emphasized” (Rosenzweig, Fleming, & Clarke, 1947,
p. 16[...]).
TABLE 1

MEANS AND STANDARD DEVIATIONS OF PICTURE-FRUSTRATION STUDY VARIABLES
(Rosenzweig’s Norms in Parentheses)


Twenty-three out of an over-all total of 73 relationships tested by
chi square were significant at the .05 level or beyond, while 12 more
approached significance at the .10 level. In relating E and IE to
measures of aggression elicited by other techniques, 12 of 22
relationships for E and 8 of 22 for IE were found to reach the .10
level or beyond. The total number of significant findings is thus
well beyond the number expected on the basis of chance alone.
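
The comparison with chance can be made concrete with a little arithmetic. The sketch below assumes, for illustration only, that the 73 tests are independent and that each has a .05 chance of reaching significance when no relationship exists; the measures in this study are intercorrelated, so the tail probability is at best a rough guide. The function names are hypothetical.

```python
from math import comb


def expected_by_chance(n_tests, alpha):
    """Expected number of significant results if every null hypothesis were true."""
    return n_tests * alpha


def binomial_tail(n_tests, k, alpha):
    """P(at least k of n_tests significant), assuming independent null tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


expected = expected_by_chance(73, 0.05)  # roughly 3.65 of the 73 tests
tail = binomial_tail(73, 23, 0.05)       # chance of 23 or more under independence
print(round(expected, 2), tail)
```

Under these assumptions about 3.65 of the 73 chi squares would be expected to reach the .05 level by chance, against the 23 observed.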

Chi squares significant at the .05 level or beyond revealed that E
was associated with the S’s report of aggression and negative
feelings in object relationships, as with fathers and wives; the
perception of peers as hostile and threatening; a history of
antisocial aggression; and murder in the present offense. Moreover,
the data reflect a number of trends which fall between the .05 and
the .10 level. High E tends to be associated with the psychiatrist’s
rating of high verbal aggressiveness and, as judged from the
Rorschach, poor control of aggression, the use of projection as a
defense mechanism, and the diagnosis of schizophrenia. The only
finding which seems inconsistent with the general direction of
results was a trend for high E Ss to report infrequent fighting with
peers.

The IE measure was found to correlate .48 with E, a moderate but
statistically significant degree of correlation. It is not
surprising, therefore, that IE related to many of the same measures
as E. Thus, like high E, high IE relates to aggression and negative
feelings in object relationships, as with siblings and wives; to a
history of antisocial aggression; and to murder in the present
offense. In addition, high IE was found to be significantly
associated with poor control of aggressive impulses and the
expression of intense affect on the Rorschach.

Both E and IE were unrelated to the degree or amount of aggression
manifested on the Rorschach, S’s report of his reaction to
frustration and criticism, S’s rating of his self-control over anger,
interviewer’s rating of S’s aggressivity, current fighting with
peers, and present offense classified as sexual, aggressive, or
acquisitive.


TABLE 2

THE RELATIONSHIP OF EXTRAPUNITIVENESS (E) AND INTENSITY OF
EXTRAPUNITIVENESS (IE) TO AGGRESSION VARIABLES

Column groups: High E Scores are Related to; High IE Scores are
Related to
Columns within each group: df; Chi squareᵃ; Significance levelᵇ

Variable Description

P-F Variables
    High extrapunitivenessᵉ
    High extrapunitiveness intensity
    Low intropunitiveness
    Low impunitiveness
    Low object dominance
    High ego defensiveness
    Low need persistence

Rorschach
    Degree of aggression (73%)ᶜ
    Inadequate control of aggression (65%)
    High degree of intense affect (62%)
    Use of projection as a defense (7[...]%)
    Diagnosis of schizophrenia (65%)

Attitude Scale
    Reaction to frustration
    Reaction to criticism
    Perception of peers as hostile
    Negative feeling toward father

Psychiatric Schedule
    Subject’s rating of self-control over anger
    Psychiatrist’s rating of subject’s aggressiveness
    Tendency to express aggression verbally (Psychiatrist’s rating of
        manner in which subject expresses aggression)
    Submissive behavior toward father (developmental)
    Frequent aggressive behavior toward younger male siblings
    Infrequent fighting with peers (developmental)
    Frequent destruction of property (developmental)
    Fighting with peers (current)
    Frequent verbal aggression toward wife
    Frequent physical aggression toward wife
    History of serious anti-social aggression
    Present offense classified as sexual, aggressive, or acquisitive
    Murder in present offense

ᵃ Yates correction in two by two tables.
ᵇ Two-tailed.
ᶜ Exact agreement of two raters on three [...]
ᵈ df = 1.
ᵉ r = .48 significant at the .01 level.


Both amount and intensity of extrapunitiveness were compared with
age, race, religion, rural-urban background, social class, marital
status, education, IQ, and length of time in prison. Those Ss with
high E tend to be older (χ² = 9.86, df = 2, p = .01), widowed or
divorced (χ² = 3.71, df = 1, p = .06), and to have an education
beyond the 8th grade (χ² = 7.17, df = 2, p = .04). Ss with high IE
tend to come from an urban background (χ² = 1.56, df = 2,
p = [...]). None of the other comparisons yielded chi squares
approaching significance.

The relationships of intropunitiveness and 
impunitiveness to other measures discussed 
above were investigated but results have not 
been presented since they were generally 
found to be the inverse of those relationships 
reported for extrapunitiveness by virtue of 
the reciprocal nature of the scoring system. 


DISCUSSION


This investigation has provided evidence 
that the P-F relates to a variety of manifesta- 
tions of aggressive behavior as assessed by a 
psychiatric interview, Rorschach evaluation, 
and an attitude scale. The scoring dimension 
of E related to aggression in terms of hostile, 
aggressive interpersonal feelings and behav- 
ior, and antisocial aggression with tendencies 
toward the use of projection and inadequate 
impulse control. The score for IE developed 
in this study and the standard scoring dimen- 
sion of extrapunitiveness (E) show consider- 
able overlap, the former also relating to the 
adequacy of impulse control and intensity of 
affect on the Rorschach. Determination of the 
value of the intensity measure in extending 
the range of the P-F scoring system is a prob- 
lem for further research. 

The results also indicate that higher edu- 
cation and IQ are positively related to high 
E and high IE. An additional analysis of the 
data revealed that IQ and education were not 
significantly correlated with other variables 
bearing on aggression so that no attempt was 


made to control their effects statistically. One 
could speculate that the P-F technique was 
most suitable for eliciting aggressive responses 
from Ss in our sample who are better able to 
deal with verbal stimuli or to express aggres- 
sion verbally. 

The P-F failed to relate to many aspects 
of aggression, such as type of criminal offense, 
reported behavior in response to frustration 
and criticism, and aggressive behavior toward 
peers. These measures which did not relate to
the P-F seem no less reasonable criteria
than those which had a positive relationship
to it. The problem of evaluating validity in
terms of several relationships, some of which
are significant or nearly significant while
others are not, may be viewed in at least
two ways. One is to postulate that items 
which relate positively to the P-F should 
characterize some particular aspect of ag- 
gression (e.g., level). Examination of Table 2 
indicates that for our results such a con- 
ceptualization would, at best, require consid- 
erable and perhaps tenuous speculation, inas- 
much as diverse features of aggressive behav- 
ior relate to extrapunitiveness. This failure to 
find a cluster of conceptually meaningful re- 
lationships may be the fault of the study, but 
we would suggest that expecting such a clus- 
ter may not be reasonable, which brings us 
to the alternative interpretation. 

Suppose, for instance, that there are two 
Ss with the same high E P-F scores for whom 
extensive personality evaluations, presumed 
valid, are available. One of these Ss gener- 
ally expresses his aggression only verbally. 
He is sarcastic, bitter, and superior but never 
attacks anyone directly and has no history of 
overt physical aggressive behavior. For the 
other individual the high E score is but one 
manifestation of generally very aggressive be- 
havior expressed along the entire spectrum of 
this attribute. There would probably be no
relationship, or a negative one, between the P-F
and measures of overt aggression for the first
S and a positive relationship with almost any 
measure of aggression for the second S. This 
illustrates how the same form and intensity 
of aggression in a particular situation (e.g., 
high E on the P-F test) may reflect different 
types and degrees of aggressivity for different 


individuals. To be sure, aggression shown on 
one technique, such as the P-F, must be
reflected in some other measures of that
attribute if the technique is valid; however,
which measures will be found to correlate 
would, from this point of view, be a prob- 
lem of sampling. That is, positive correla- 
tions can be expected to the degree to which 
the personality characteristics of different Ss 
overlap in the expression of aggression. Since 
even elaborate matching procedures are un- 
likely to yield identical samples of Ss, differ- 
ent samples can be expected to show different 
interrelationships between a particular meas- 
ure, such as the P-F, and other measures de- 
signed to reflect a range of that attribute 
(e.g., Rorschach, attitude scale, other meas- 
ures of aggression). 

Nevertheless, while two samples would
rarely yield exactly the same interrelationships,
it is possible that given a large number
of successive samples tested with the same
battery, the test measure (e.g., the P-F) will 
be found to correlate more often with some 
criterion measures than with others. Thus, 
while the pattern of intercorrelations may not 
make too much sense for any particular sam- 
ple (as in the present study), it is possible 
that a conceptually meaningful pattern of 
test-criteria relations may emerge through 
frequency counts of such relations across a 
large number of samples. Such a procedure 
might help to establish the specific construct 
validity of the P-F by indicating which as- 
pects of aggression it reflects more often than 
others. The present study is concerned with 
the much more modest task of investigating 
whether the P-F relates to any aspect of ag- 
gression. For this purpose it would seem a 
reasonable test of validity to require that, 
given a pool of criterion measures (e.g., atti- 
tude scale, Rorschach, etc.), the number of 
positive relations to the measure tested (P-F) 
exceed chance expectations. 
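
The chance criterion proposed above can be illustrated with a rough calculation. The sketch below takes the counts reported in the Summary of this study (44 chi square comparisons, 10 of them significant beyond the .05 level) and asks how often 10 or more significant results would arise by chance alone. It assumes the 44 comparisons are independent, which the article does not claim and which is unlikely to hold, since E and IE overlap and several criterion measures are related; the function name `binomial_upper_tail` is introduced here for illustration only.

```python
from math import comb


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts reported in the Summary of this study.
n_comparisons = 44   # chi square comparisons
n_significant = 10   # comparisons significant beyond the .05 level
alpha = 0.05

# Assumption (not made in the article): the 44 comparisons are independent.
expected_by_chance = n_comparisons * alpha
tail_probability = binomial_upper_tail(n_comparisons, n_significant, alpha)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least {n_significant} of {n_comparisons}): {tail_probability:.6f}")
```

Because the comparisons are not independent, the tail probability should be read as an optimistic bound rather than an exact significance level.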


SUMMARY 


Extrapunitiveness in the Picture-Frustration
Study and an intensity of extrapunitiveness
measure developed by the authors were
related to 22 measures of aggression derived
from the Rorschach, an attitude scale, a
psychiatric interview schedule, and case history
data. The Ss in this study were 121 inmates
of a state prison selected so as to match the
entire prison population. Of a total of 44 chi
square comparisons between the Picture-Frustration
Study and these measures, 10
were significant at better than the .05 level
of confidence, and another 10 between the .05
and .10 level. These results were interpreted
as indicating that the P-F has some relation
to other measures of aggression. The P-F was
not found to tap any particular level or aspect
of aggression, and it was suggested that
repeated cross-validation would be required
before possible criteria validity could be
established.


J. Kaswan, M. Wasman, and L. Z. Freedman
A FACTOR ANALYSIS OF GOUGH’S CALIFORNIA
PSYCHOLOGICAL INVENTORY¹


JAMES V. MITCHELL, JR. AND JOHN PIERCE-JONES


University of Texas


The California Psychological Inventory
(CPI) developed by Gough (1957) during
the past decade is a structured verbal inventory
designed to measure such personality
characteristics as dominance, tolerance,
responsibility, etc. These characteristics are
thought to be more important for comprehending
normal behavior than are the traits
sampled by instruments like the MMPI,
which appear to be most useful in describing
psychopathological conditions and maladjustment
syndromes (cf. Sloan & Pierce-Jones,
1958). The 18 scales of the CPI are designed
to yield a presumably meaningful set of
scores which can provide a profile representing
the personality pattern of an individual.
Shaffer (1959) has written that “the CPI
appears to be a major achievement.” Cronbach
(1959), while frankly eschewing Gough’s
atheoretical approach to personality measurement,
has applauded Gough’s apparent technical
psychometric skill. Thorndike (1959),
however, has asserted that the scales of the
CPI “. . . provide a very redundant, inefficient,
and confused picture of individual personalities.
Of the 18 scales, there are
only 4 that fail to correlate at least .50 with
some other scale.”


PROBLEM 


The present investigation was motivated, 
in part, by our difficulty in comprehending 
individual CPI profiles, a difficulty resulting 
from the conditions so clearly indicated by 
Thorndike’s comments mentioned above. We 


were also puzzled by Gough’s grouping of 


1 This investigation was conducted with support
from the Mental Health in Teacher Education Project
at the University of Texas, a project which is
supported by a grant from the National Institute of
Mental Health.


the 18 CPI scales into several named clusters
shown in the Manual (Gough, 1957), since
there appeared no clear empirical basis for
such clusters in the correlation matrices
provided. In consequence, the factor analytic
research reported in this paper was undertaken
to obtain the kind of evidence that would
help to shed light on the empirical justification
of the scales and scale groupings offered
by Gough.


PROCEDURE 


A total of 258 cases was employed in this
investigation. This ... sample included
213 females and 45 males who were enrolled
... in a teacher training curriculum. The
subjects (Ss) generally had ... academic
major fields of work in universit...
A correlation matrix ...
... application of Humphrey ...
Burt’s empirical formula ...
that factorization should ...
... of the fourth factor. Orthogonal rotations were
... technique, the characteristics of which have
been discussed in ... Comrey (1957a) ...
RESULTS


The rotated factor matrix is shown in
Table 1. The four factors accounted for 26%,
15%, [illegible], and 12% of the total variance,


2 A one-page table of the CPI scale intercorrela-
TABLE 1


ROTATED FACTOR MATRIX FOR THE
CPI SCALES


Class I Measures of Poise, Ascendancy, and
Self-Assurance

1. Dominance
2. Capacity for Status
3. Sociability
4. Social Presence
5. Self-Acceptance
6. Sense of Well-Being

Class II Measures of Socialization, Maturity,
and Responsibility

7. Responsibility
8. Socialization
9. Self-Control
10. Tolerance
11. Good Impression
12. Communality

Class III Measures of Achievement Potential
and Intellectual Efficiency

13. Achievement via Conformance
14. Achievement via Independence
15. Intellectual Efficiency

Class IV Measures of Intellectual and
Interest Modes

16. Psychological Mindedness
17. Flexibility
18. Femininity

Percentage of Total Variance


respectively. Analysis of the factor loadings
resulted in the following factor descriptions.

Factor I, probably the most important factor
of the present analysis, had its highest
loadings for the CPI scales named Self-Control,
Good Impression, Achievement via Conformance,
Sense of Well-Being, Tolerance,
and Responsibility. Provisionally, it appears
that this factor might well be named Adjustment
by Social Conformity. Factor II had
five relatively high loadings, ranging from
.59 to .78. The CPI scales named Dominance,
Capacity for Status, Sociability, Social Presence,
and Self-Acceptance were the important
ones involved in this factor, suggesting that
it should be named Social Poise or, alternatively,
Extroversion. Factor III was not as
well defined as the previous factors, but it
had loadings above .40 for the CPI scales
labeled Responsibility, Communality, Socialization,
and Femininity. Gough’s (1957) original
descriptions of characteristics associated
with these scales strongly suggest the serious,
responsible, conscientious person, so we have
tentatively chosen to name this factor Super-Ego
Strength. Factor IV is possibly the most
interesting of the factors identified in the
present analysis. Having loadings of .5 or
higher for CPI scales called Tolerance, Intellectual
Efficiency, Capacity for Status, Flexibility,
Social Presence, and Achievement via
Independence, this factor suggests a complex
of qualities which might augur well for success
in a wide range of human activities.
Common to all of these qualities is an emphasis
on intellectuality, broad interests and
perspectives, and thoroughgoing independence.
Our present disposition is to name this
factor Capacity for Independent Thought
and Action.


DISCUSSION 


Gough (1957) has classified the 18 CPI
scales into the following four groups:


Class I: Measures of Poise, Ascendancy, and Self-Assurance

Class II: Measures of Socialization, Maturity, and
Responsibility

Class III: Measures of Achievement Potential and
Intellectual Efficiency

Class IV: Measures of Intellectual and Interest
Modes


The factor analytic results reported in this
paper lend support to four CPI scale classes
of somewhat different composition from
Gough’s as can be inferred from the results
and factor descriptions already presented in
Table 1. It is interesting to notice, for example,
that our Factor II, Social Poise or
Extroversion, encompassed five of the six
scales included in Gough’s Class I. If Gough’s
four classes of scales are viewed as hypotheses
concerning factor structure, his first class
appears from our analysis to be a reasonable
one, except that we would be forced
to exclude the Well-Being scale from the
class. Our Factor III, here called Super-Ego
Strength, identified three of Gough’s Class II
measures and one scale, Femininity, from
Gough’s Class IV. Factor I identified in the
present analysis located five Class II scales,
three Class III scales, and one scale from
each of Classes I and IV when loadings of
.43 or higher were considered. Our Factor IV,
Capacity for Independent Thought and Action,
had two substantial loadings for scales
included by Gough in his Class I, one loading
representing a Class II measure, two
loadings for scales included in Class III, and
one loading representing a measure included
by Gough in his Class IV. If the results obtained
in the present analysis are valid, it
seems clear that Gough’s Class IV is unacceptable
as a cluster of related scales, since
its constituent scales load on three separate
factors. Similarly, Class III as defined by
Gough does not seem to be empirically justified.
Finally, Gough’s Class II appears to be
a mixture of scales which, in the present
analysis, have their loadings on Factors I
and III. If the CPI continues to be scored
for 18 scales, and if a defensible means for 
grouping the scales into classes is desired, 
classification should probably be based upon 
factor analytic findings such as those pre- 
sented in this report. 
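
The comparison of Gough's classes with the obtained factors amounts to a cross-tabulation, which can be sketched as follows. Class membership is taken from the stub of Table 1, and the scales assigned to Factors II, III, and IV are those named in the Results. Factor I is left out because the text names only six of the ten scales said to load .43 or higher on it. The names `GOUGH_CLASS`, `FACTOR_SCALES`, and `class_counts` are introduced here for illustration and are not part of the original analysis.

```python
from collections import Counter

# Gough's (1957) class for each of the 18 CPI scales, as listed in Table 1.
GOUGH_CLASS = {
    "Dominance": "I", "Capacity for Status": "I", "Sociability": "I",
    "Social Presence": "I", "Self-Acceptance": "I", "Sense of Well-Being": "I",
    "Responsibility": "II", "Socialization": "II", "Self-Control": "II",
    "Tolerance": "II", "Good Impression": "II", "Communality": "II",
    "Achievement via Conformance": "III",
    "Achievement via Independence": "III",
    "Intellectual Efficiency": "III",
    "Psychological Mindedness": "IV", "Flexibility": "IV", "Femininity": "IV",
}

# Scales named in the Results as defining each factor (Factor I omitted).
FACTOR_SCALES = {
    "II": ["Dominance", "Capacity for Status", "Sociability",
           "Social Presence", "Self-Acceptance"],
    "III": ["Responsibility", "Communality", "Socialization", "Femininity"],
    "IV": ["Tolerance", "Intellectual Efficiency", "Capacity for Status",
           "Flexibility", "Social Presence", "Achievement via Independence"],
}


def class_counts(scales):
    """Count how many of the given scales fall in each of Gough's classes."""
    return Counter(GOUGH_CLASS[scale] for scale in scales)


for factor, scales in FACTOR_SCALES.items():
    print(f"Factor {factor}:", dict(sorted(class_counts(scales).items())))
```

The counts reproduce the tallies given in the Discussion: Factor II draws only on Class I, Factor III on three Class II scales and one Class IV scale, and Factor IV on all four classes.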

It seems quite apparent from the results of 
this research that the CPI cannot be re- 
garded with real justification as measuring 
the 18 relatively independent personality di- 
mensions that it is purported to measure. It 
is also true, judging by our results, that in- 
dividual personality profiles might well be 
based on only a few selected CPI scales. Per- 
haps, for example, the Self-Control scale can 
be regarded as virtually a pure measure of 
Factor I, Adjustment by Social Conformity, 
since it loaded .92 on that factor and was 
essentially independent of the three other 
factors. The Dominance, Sociability, and Self- 
Acceptance scales seem equally good meas- 
ures of Factor II, Social Poise, and they lead 
to no appreciable inferences regarding Fac- 
tors I, III, and IV. Factor III, Super-Ego 
Strength, is well estimated by the Commu- 
nality and Femininity scales, which appear 
to be independent of the three remaining fac- 
tors. The Flexibility scale occupies a similar 
position with regard to Factor IV, Capacity 
for Independent Thought and Action. In each 
case the decision to use a scale or scales to 
represent a factor would naturally be con- 
tingent upon reliability data as well as fac- 
torial composition. 

It is interesting to note that with the ex- 
ception of the Communality and Sociability 
scales, those scales which seem best to represent
the four factors identified in the present
analysis lead to personality descriptions
couched in long-used psychological terms. In 
this regard, it can be noted that Cronbach 
(1959) has argued for such descriptions over 


others couched in terms like “social pres- 
ence,” which apparently refer to social be- 
havior patterns often regarded as compli- 
cated resultants of motivational dispositions, 
abilities and skills, and situation-linked stimu- 
lus conditions. 

While the present analysis may have pro- 
vided some clarification of personality meas- 
urement using the CPI, it should be acknowl- 
edged that one limitation of this study exists 
because factor identification and description
are partially dependent upon Gough’s origi- 
nal selections of items for scales and upon 
the verbal descriptions assigned to the scales. 
But this fact in no way negates our earlier 
conclusion that the 18 CPI scales represent 
a much smaller number of personality dimen- 
sions. Nor, indeed, does the fact that our 
factor interpretations made use of the origi- 
nal scale descriptions lessen the significance 
of these interpretations, since a careful re- 
view of the items included in the factors 
tends to support the designations which we 
have assigned to each. 


SUMMARY 


A centroid factor analysis was carried out 
with the 18 scales of the California Psycho- 
logical Inventory. The sample employed con- 
sisted of 258 students enrolled in an intro- 
ductory course in educational psychology at 
the University of Texas. Four factors were 
extracted and rotated by Kaiser’s Varimax 
method. The factors identified were tentatively
named: I. Adjustment by Social Conformity,
II. Social Poise or Extroversion, III.
Super-Ego Strength, and IV. Capacity for
Independent Thought and Action. It was suggested
that individual personality profiles
based on the CPI might use only those scales
best representing the four factors identified
in this research. Such a limitation would produce
profiles permitting personality descriptions
to be made in such conventional psychological
terms as “dominance,” “self-acceptance,”
and the like, rather than in such
complex social behavioral terms as, for example,
“social presence” and “capacity for
status.”


James V. Mitchell, Jr. and John Pierce-Jones
THE DEVELOPMENT OF AN AFFECT ADJECTIVE CHECK
LIST FOR THE MEASUREMENT OF ANXIETY


MARVIN ZUCKERMAN


Institute of Psychiatric Research


Most personality tests are designed to measure
relatively stable traits. Testees respond to
items asking how they “generally,”
“usually,” “often,” “seldom,” etc.
behave. The time referent
is vague and subjects (Ss) may interpret
the questions as referring to the last
week, month, year, or their entire lifetime.
Many of the traits in which clinicians are
interested, particularly affective traits such as
hostility or anxiety, are assumed to show
large intra-individual variation from hour to
hour and from day to day. While the concept
of a “general level of anxiety” may be useful
for gross discrimination, there are many
occasions where one would like to measure
changes in anxiety over shorter periods of
time. Experiments where an attempt is made
to induce anxiety, or experiments on the effect
of the “tranquilizer” drugs are examples
of research where temporal ambiguity of the
usual anxiety questionnaires might make them
insensitive to change. Experiments of this kind
conducted at the Institute of Psychiatric Research
led the author to the development of
a test which could be given quickly, scored
objectively, and adapted for varying time
sets.²

The adjective check list method seemed
ideally suited for the purposes discussed
above. Gough (1955) has used a check list
test for developing empirical scales of various
personality traits. Nowlis (1953) reported the
use of a check list in measuring changes
induced by drugs given to college students.

The purpose of the present study was to
develop empirically a scoring key for “anxiety”
using a pool of adjectives with various
affective connotations, and to test the
reliability and validity of this anxiety score.


1 Now at Department of Psychology, Brooklyn
College.

2 The idea for this test emerged out of a discussion
with John I. Nurnberger on this problem.

EMPIRICAL DEVELOPMENT OF A SCORING KEY 
Adjectives with affective connotations were 
collected from Gough’s and Nowlis’ lists and 
from a thesaurus. Adjectives which were of 
low frequency in the written language were 
excluded so that Ss of less than average in- 
telligence could understand the items. The 
final list of adjectives consists of 61 items.³

The scoring key was derived from item
analyses in two studies. The first study by
Persky, Maroc, Conrad, and den Breeijen
(1959) compared a group of psychiatric patients
rated high in anxiety with a group of
normal controls rated low in anxiety on the
basis of a psychiatric interview. A preliminary
scoring key of 30 words was used, based
on an a priori selection of words with “anxiety”
connotations. This score yielded a highly
significant difference between the groups. An
item analysis was performed by comparing
the frequencies in each group checking, or not
checking, each of the adjectives. Twenty-four
of the words yielded significant differences
(p < .05). Twelve of these were “anxiety-plus”
words (checked more frequently by the
High Anxiety group) and 12 were “anxiety-minus”
words (checked more frequently by
the Low Anxiety group). Although the two
groups in this study were not ideally matched
on nonanxiety variables it is interesting that
all but one of the anxiety-plus words had been
previously selected as anxiety words on the
basis of their connotation and were among
those used in the a priori scoring.

The second study by Levitt, den Breeijen,
and Persky (in press) measured the effects of
a hypnotically induced anxiety state in normals.
A score on the Affect Adjective Check
List (AACL) based on the item analysis in
the preceding study showed a highly significant
rise during the anxiety condition. An
item analysis yielded 38 words showing significant
changes in checking frequency during
the anxiety condition. Twenty-one were
anxiety-plus words (increase in checking frequency
during anxiety condition) and 17
were anxiety-minus words (decrease in checking
frequency during anxiety condition).


3 A copy of the check list may be secured by writing
to the author.

The final scoring key includes 21 words 
which proved to be significantly related to 
anxiety in both of the aforementioned studies. 
These words are listed below. 


Anxiety-plus: afraid, desperate, fearful, frightened,
nervous, panicky, shaky, tense, terrified, upset,
worrying.

Anxiety-minus: calm, cheerful, contented, happy,
joyful, loving, pleasant, secure, steady, thoughtful.

Anxiety-plus words are scored 1 if checked,
and anxiety-minus words are scored 1 if not
checked. The possible range of scores is 0 to
21.
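
The scoring rule just stated can be written out as a short routine. The word lists below are the 21 keyed adjectives given above; the function name `aacl_score` and the decision to ignore case and any adjectives outside the key (the full list has 61 items) are assumptions made for this sketch.

```python
# The 21 keyed adjectives listed in the text.
ANXIETY_PLUS = frozenset({
    "afraid", "desperate", "fearful", "frightened", "nervous", "panicky",
    "shaky", "tense", "terrified", "upset", "worrying",
})
ANXIETY_MINUS = frozenset({
    "calm", "cheerful", "contented", "happy", "joyful", "loving",
    "pleasant", "secure", "steady", "thoughtful",
})


def aacl_score(checked_words):
    """Score one check list: 1 point per anxiety-plus word checked,
    1 point per anxiety-minus word left unchecked."""
    checked = {word.strip().lower() for word in checked_words}
    plus_points = len(ANXIETY_PLUS & checked)
    minus_points = len(ANXIETY_MINUS - checked)
    return plus_points + minus_points


print(aacl_score(["tense", "calm", "sleepy"]))
```

One consequence of the key is worth noting: a list returned with nothing checked scores 10, since every anxiety-minus word left unchecked earns a point.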


RELIABILITY STUDY: COLLEGE GROUP I


Two versions of the AACL and the Taylor
(1953) Manifest Anxiety Scale (MAS) were
given to 50 students in two sections of an
elementary psychology class at the Purdue
University Extension School. The group consisted
of 43 males and 7 females. The average age
was 22.3 (SD = 4.6). The items in the two
versions of the AACL were the same but one
test asked the Ss to check words which
described how they “generally feel” while the
other test asked them to check items describing
how they felt “today.” On the latter test
they were told that “today” was defined as
beginning from the time they awoke that
morning. One section received the General
test first and the Today test second while the
other section took them in the reversed order.
One week later the Ss took the two AACLs
again and each section received them in an
order reversed from the first session.

Tests for the effect of order on the General
and Today tests were performed by comparing
each S’s scores on the occasions when he
took the tests first and when he took them
second. The 50 Ss scored an average of .68
point higher on the Today test when it was
taken before and .32 point higher on the General
test when it followed the Today test, but
these differences were not significant (t’s =
1.01, .95). One can conclude that order had
no significant effect on the two AACL versions.

Two kinds of reliability were examined for
the General and Today tests: internal consistency
on the first testing, and retest reliability
from the first to the second test. The
General test was expected to show reasonably
high internal reliability and retest reliability
since it was assumed that the General time
set would cause Ss to describe a stable trait.
The Today test was expected to show high
internal reliability on a single testing but low
reliability on retest. Since the Today test is
designed to measure day to day fluctuations
it is unlikely that it could be sensitive to these
fluctuations and still remain stable from week
to week.

The two measures of reliability for each of
the tests are given in Table 1. Internal reliability
is calculated using the Kuder-Richardson
Formula 20. It can be seen that the expectations
of the experimenter (E) are borne
out. The General test is reliable internally and
in retest, while the Today test is internally
reliable on a single testing but low in reliability
on a retest. This contrast in reliabilities
bears out the E’s assumptions about the
different nature of tests attempting to measure
stable and fluctuating traits. The actual
correlation between the two versions of the
AACL on the first testing was .43, indicating
a significant, but only moderate, relationship
between the two tests.
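
For readers unfamiliar with the Kuder-Richardson Formula 20 mentioned above, a minimal sketch follows. It computes KR-20 = [k/(k − 1)][1 − Σpq/σ²] from a matrix of 0/1 item scores, where k is the number of items, p is the proportion of Ss scoring 1 on an item, q = 1 − p, and σ² is the variance of total scores. The use of the population variance (dividing by N) is an assumption, since the article does not say which form was used, and the small data set is made up purely to exercise the function. Item scores are taken after keying, so an unchecked anxiety-minus word counts as 1.

```python
def kr20(item_scores):
    """Kuder-Richardson Formula 20 for a list of per-S lists of 0/1 item scores."""
    n_subjects = len(item_scores)
    n_items = len(item_scores[0])

    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_subjects
    # Population variance of total scores (an assumption; see lead-in).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_subjects
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    # Sum of p * q over items, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_subjects
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical keyed responses for four Ss on three items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(example))
```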
VALIDITY STUDY: COLLEGE GROUP II


The purpose of this study was to see if the
Today AACL anxiety score would show an
increase when this test is given on the day of
an examination. “Examination Anxiety” is not 
usually as intense as the anxiety seen in clini- 
cal patients and it varies considerably be- 
tween individuals. However, if the AACL 
proved sensitive to Examination Anxiety it 
would probably be sensitive to more intense 
forms of anxiety. 


Method 


The Ss were 35 college students in a section of
elementary psychology. Twelve were females and 23
were males. The average age was ... (SD = 2.7).
One student dropped the course and his data was
only used on comparisons involving the initial tests.

The students were given the General and Today
versions of the AACL, the MAS, and the Barron
Ego-Strength scales (Barron, 1953) on the second
day of the class. They were given the Today AACL
at the beginning of every other class meeting except
those following the examination meetings. The E was
interested in comparing the nonexam day AACL scores
with the exam day AACL scores. The AACL was
given regularly at the same time so that the students
would not guess the connection between the AACL
and their course examinations. At the end of the
course they were asked if any of them had guessed
the connection and none claimed to have seen the
purpose of the experiment. The AACL was not given
at the three meetings after examinations because at
these times the students were to hear their grades
and the E was not sure whether these occasions
should also be classified as “anxiety” situations.
There were a total of 10 nonexam days and 3 exam
days. All 34 students were present on the exam days.
Absences varied ... students. No student ...
the 13 days ...

Results


Figure 1 shows the change in AACL anxiety
scores during the course of the 13 class
meetings. In the case of absent Ss, the mean
score on the other 9 nonexam days was
substituted for the missing entries in order to
calculate comparable group means for the
different sessions. The means of the group on
nonexam days ranged from 6 to 8. The SD of
the 10 nonexam day means was .77. On the
three exam days the AACL scores rose to 9
or 10. Table 2 gives the mean differences
between: (a) AACL scores on each of the exam
days and the means of these scores on the
three or four nonexam days immediately
preceding each exam, (b) the mean anxiety score
on all three exam days and the mean score on
all 10 nonexam days. The AACL increases are
significant for each of the exam days and the
mean for all exam days is significantly higher
than the mean for all nonexam days. The
hypothesis that pre-examination anxiety will
result in increased anxiety scores on the AACL
is supported by these results.


Fig. 1. AACL anxiety scores during 13 class meetings.


TABLE 2


MEAN DIFFERENCES BETWEEN EXAM AND
NONEXAM DAY ANXIETY SCORES


Exam I
Exam II
Exam III
M all exam days − M all nonexam days
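
The substitution for absences and the exam versus nonexam comparison can be sketched as follows. Two assumptions are made that the article does not confirm: the substituted value is each S's own mean on the nonexam days he attended, and the differences are tested with a paired t statistic. The function names and the small data set are hypothetical and serve only to show the mechanics.

```python
from math import sqrt
from statistics import mean, stdev


def fill_absences(nonexam_scores):
    """Replace None (absent) with the S's mean on attended nonexam days."""
    attended = [s for s in nonexam_scores if s is not None]
    own_mean = mean(attended)  # raises StatisticsError if never present
    return [own_mean if s is None else s for s in nonexam_scores]


def paired_t(differences):
    """t statistic for the mean of paired differences (df = n - 1)."""
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n))


# Hypothetical data: per S, (nonexam day scores, exam day score).
subjects = [
    ([6, None, 8], 10),
    ([7, 7, 7], 9),
    ([5, 6, None], 9),
    ([8, 9, 7], 8),
]

differences = []
for nonexam, exam in subjects:
    filled = fill_absences(nonexam)
    differences.append(exam - mean(filled))

print("mean exam minus nonexam difference:", mean(differences))
print("paired t:", paired_t(differences))
```

Substituting an S's own mean leaves his personal nonexam average unchanged, which is why the rule affects only the session-by-session group means plotted in Figure 1.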

After these results had been obtained, an- 
other hypothesis occurred to the E. It is pos- 
sible that the amount of anxiety elicited by 
an examination might be related to how well 
one expects to do on the examination.
Although no measures had been made of the
Ss’ anticipations prior to the examination,
their actual performance on the examination
might be some measure of their preparation
and consequent anticipation. The class was 
divided into thirds on the basis of their grades 
on each of the examinations. The mean 
changes on the AACL anxiety score of the 
High and Low Grade groups were then com- 
pared. The results of these tests are contained 
in Table 3. It can be seen that on all three 
examinations the Ss who made high grades 
TABLE 3


COMPARISONS OF MEAN DIFFERENCES BETWEEN EXAM
AND NONEXAM DAY ANXIETY SCORES IN HIGH
AND LOW GRADE GROUPS


a Differences in ... cut-off point.
One-tailed test.


showed less of an increase in anxiety scores 
than the Ss who made low grades. The dif- 
ferences are significant for the first two ex- 
aminations but not significant for the third 
examination. The class was then divided into 
thirds on the basis of their total percentage 
score for the entire term. The High and Low 
Grade students were compared on the differ- 
ences between the mean for all exam days 
and the mean for all nonexam days. This re- 
sult is also contained in Table 3. The differ- 
ence is significant at the .05 level. The cor- 
relation between the total percentage grade 
for the term and the mean exam minus non- 
exam day differences for the total group of 34 
was —.34, which was significant below the .05 
level. The results generally support the hy- 
pothesis that the students doing well on ex- 
aminations would show less increase in anx- 
iety scores than the students doing poorly on 
examinations. However, these results do not 
definitively support the idea that the student’s
anticipation of success or failure is responsible 
for this difference. An alternate hypothesis 
might be that the pre-examination anxiety 
actually caused the poorer performance on 
the examination. In the next study of this 
question, the E hopes to obtain anticipations 
before the examination, on the hypothesis 
that these will be more strongly related to the 
anxiety scores than the actual grades on the 
examinations. 
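
The significance of the correlation of −.34 reported above can be checked with the usual conversion of r to t, namely t = r√(N − 2)/√(1 − r²) with N − 2 degrees of freedom. The sketch below applies it to the reported r and N = 34. The critical value of about 2.037 for 32 degrees of freedom (two-tailed, .05 level) is taken from standard t tables and is not given in the article; the function name `t_from_r` is introduced here.

```python
from math import sqrt


def t_from_r(r, n):
    """Convert a product-moment correlation to t with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


r_reported = -0.34   # term grade vs. exam minus nonexam difference
n_subjects = 34

# Two-tailed .05 critical value for df = 32, from a standard t table
# (an outside figure, not stated in the article).
T_CRITICAL = 2.037

t_value = t_from_r(r_reported, n_subjects)
print(f"t = {t_value:.3f}, df = {n_subjects - 2}")
print("significant at .05 (two-tailed):", abs(t_value) > T_CRITICAL)
```

The obtained t lies only slightly beyond the critical value, which agrees with the article's description of the correlation as significant below the .05 level, though not by a wide margin.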


VALIDITY: RELATIONSHIPS WITH OTHER
TESTS—GROUPS I AND II, AND
PREGNANT WOMEN GROUP


Since it has been hypothesized that most 
questionnaires give the S a general time set it 


was expected that the MAS and other tests 
would correlate significantly with the General 
version of the AACL but not with the Today 
version given on one particular day. However, 
it was expected that the MAS would correlate 
significantly with the mean of a number of 
AACL anxiety measures since this would pre- 
sumably be a better measure of the general 
anxiety level than a score on a single day. 
The MAS should correlate significantly with 
the Month version of the AACL given to the 
pregnant women since a month is a long
enough period to consider a general time set.


Method 


In each of the college groups the MAS
was given to the Ss. In College Group II the Ss also
took the Barron Ego-Strength scale. A third group
of Ss was collected in another study on pregnancy.
This study involved monthly interviews and tests
with a group of 51 pregnant women. The women
were mainly from a ... population. The average
age was 21.1 (SD ...). The average education
in years was 11.3 (...). The group consisted
of 41 Negro and 11 white Ss. The AACL used
in this study asked the Ss how they had been feeling
during the past month. At the time of the first
interview, the Ss were given the Month AACL, MAS, the
MMPI M-F scale, and three scales from the Parental
Attitude Research Instrument (Schaefer & Bell,
1958), the latter constituting a factor called
“Hostility-Rejection” (Zuckerman, Ribback, Monashkin,
& Norton, 1958). In each subsequent interview the
Ss were again given the AACL and the MAS. The
number of interviews ranged from 1 to 6. A subgroup
of 10 Ss had the AACL and the MAS
during the sixth, seventh, eighth, and ninth months
of pregnancy.


Results 


The correlations between the MAS and the
various versions of the AACL given to the
three groups of Ss are presented in Table 4.
In the College Group I the MAS correlated
significantly with the Today AACL (r = .29)
and did not correlate significantly with the
General AACL (r = [illegible]). The direction
of the difference in this group was contrary
to expectation but the difference between
these correlations was not significant
(t = .78). In the College Group II the MAS
correlated significantly with the General
(r = .58) and did not correlate significantly
with the Today AACL (r = .32), according to
expectation. The correlation between the MAS
and General AACL was significantly higher
than the correlation between the MAS and
Today AACL (t = 2.07, p < .05). The
differences in results between these two
similar samples are difficult to explain. The
MAS correlated significantly with the mean of
10 Today AACL scores (r = .52), as expected.
It did not correlate significantly with the
mean of the AACL scores on the three
examination days (r = .28). The difference
between these latter two correlations was not
significant (t = .63). However, the
correlation between MAS and AACL on the first
examination day was .40, which was significant
below the .05 level. The correlation with AACL
on the second exam day dropped to .09. The
correlation for the third exam day was .19.
Neither of the latter correlations was
significant. The differences between the
correlation on the first exam day and the
correlations on the second and third exam days
were not significant (t's = 1.75, 1.26).

4 This study is sti

[Table 4: rows for Month 6, Month 7, Month 8, and Month 9; values not recoverable]

The correlations of MAS with AACL for
the group of pregnant women are also
contained in Table 4. The correlation between
the tests on the first occasion when they were
both given (N = 51) was .65, significant
below the .001 level. The more General time set
of a Month resulted in a significant
correlation with MAS, as expected. The relationship
with MAS as a function of the particular
month of pregnancy was measured in the 10 
Ss who had taken both tests during the sixth 
to ninth months. The average MAS scores 
showed a significant drop during this period 
while the AACL anxiety scores showed no 
trend. It can be seen that the correlations be- 
tween the two tests are very high during the 
sixth, seventh, and eighth months but drop to 
a nonsignificant .38 during the ninth month. 
The difference between the correlation of the 
two tests in the eighth month and in the ninth 
month was not significant (t = 1.77).
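
The significance statements for single correlations reported here can be checked with the usual t transformation of a Pearson r, t = r * sqrt((N - 2) / (1 - r^2)) on N - 2 degrees of freedom. The following is a minimal sketch added for illustration, not the author's computation. The function name `t_for_r` is hypothetical, and the critical value is a standard two-tailed tabled value rather than a figure taken from the article.

```python
import math


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Two-tailed .05 critical value of t for 8 df from a standard table.
# Assumption: the article does not say which table or tail was used.
T_CRIT_05 = {8: 2.306}

# Ninth-month correlation between MAS and AACL among the 10 Ss.
t_ninth = t_for_r(0.38, 10)

# First-occasion correlation between MAS and Month AACL, N = 51.
t_first = t_for_r(0.65, 51)

print(f"ninth month: t = {t_ninth:.2f} on 8 df")
print(f"first occasion: t = {t_first:.2f} on 49 df")
print("ninth month significant at .05:", abs(t_ninth) > T_CRIT_05[8])
```

Under these assumptions the ninth-month r of .38 gives a t of about 1.16 on 8 df, short of 2.306, which agrees with the description of that correlation as nonsignificant. The first-occasion r of .65 with 51 Ss gives a t of about 5.99 on 49 df, which exceeds the tabled two-tailed .001 value for 40 df (3.551).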

In the College Group II, the Barron Ego-Strength
scale correlated significantly with the General
AACL score (r = [sign illegible] .37), and
nonsignificantly with the mean for exam days
(-.28) and the mean for nonexam days (-.34).
The mean AACL scores in the Pregnant Group did
not correlate significantly with the parental
attitude scales (Marital Conflict, r = .14;
Irritability, r = .02; Rejection of the
Homemaking Role, r = .21), and they did not
correlate significantly with the M-F scale
(-.14), or with a score based on physical
complaints in the interview (.20).
The failure of the AACL to correlate with 
these measures is not crucial for the validity 
of the test, for these tests are not intended as 
direct measures of anxiety. 


AACL AND SEX, AGE, AND EDUCATION


The total college group used in the first
two studies consisted of 65 males and 19 fe- 
males. The differences between the means for 
males and females on the General and Today 
versions of the AACL were compared and 
found to be small and insignificant (General, 
t = .09; Today, t = .49). In the same group 
of 84 Ss, age was not found to correlate with 
the General AACL (r = .08) or with the Today
AACL (r = .01). In the group of 51
pregnant women, the Month AACL average
did not correlate significantly with age
(r = -.12) or with education as measured by
years of school (r = .07).


NORMS


Although the samples used are not large
enough to warrant establishing standard
scores, the reader may be interested in
comparing their samples with the ones used in
this study. The means and standard deviations
for the combined college group of 84
Ss were: Today, M = 7.38, SD = 3.98; Gen- 
eral, M = 5.15, SD = 3.06. For the group of 
51 pregnant women given the Month version 
of the AACL for the first time, M = 7.14, 
SD = 3.74. In using these normative figures 
one should consider the characteristics of the 
samples as described in the article. 
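
A reader who wants to set a new sample beside these figures could do so from summary statistics alone. The sketch below is an illustrative assumption, not a procedure from the article: it uses Welch's unequal-variance t, the names `Summary` and `welch_t` are hypothetical, and the "new sample" values are invented for the example. Only the Today norm (M = 7.38, SD = 3.98, N = 84) comes from the text.

```python
import math
from typing import NamedTuple, Tuple


class Summary(NamedTuple):
    mean: float
    sd: float
    n: int


def welch_t(a: Summary, b: Summary) -> Tuple[float, float]:
    """Welch's t and approximate df for two independent samples,
    computed from means, standard deviations, and sample sizes."""
    va = a.sd ** 2 / a.n  # squared standard error of mean a
    vb = b.sd ** 2 / b.n  # squared standard error of mean b
    t = (a.mean - b.mean) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.n - 1) + vb ** 2 / (b.n - 1))
    return t, df


# Today AACL figures for the combined college group, from the article.
TODAY_NORM = Summary(mean=7.38, sd=3.98, n=84)

# Hypothetical new sample: these numbers are made up for illustration.
new_sample = Summary(mean=9.0, sd=4.2, n=30)

t, df = welch_t(new_sample, TODAY_NORM)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Because the published samples are small and drawn from particular populations, any such comparison is descriptive at best and should be read with the sample characteristics in mind.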


SUMMARY AND CONCLUSIONS 


A list of affectively toned adjectives was
used in the development of an anxiety scoring
key. Adjectives which differentiated High
and Low Anxiety groups (selected by
psychiatric interview), and showed significant
changes in checking frequency during a
hypnotically suggested anxiety state, were used
in the key. Two forms of the test with different
time sets were used in a reliability study.
Both the “General” and “Today” forms had
adequate internal reliability on a single
occasion, but only the General form
demonstrated marked retest reliability. These
results were expected because of the differences
in the time set. The Today version of the 
AACL was given repeatedly to a second 
group of college students at the beginning of 
a class period. Anxiety scores on examination 
days were compared with scores on nonex- 
amination days. The AACL score was found 
to rise significantly on examination days. On 
two of the three examinations, the students 
who made good grades on the examinations 
showed a significantly smaller rise in anx- 
iety scores than the students who made poor 
grades on the examinations. The General 
AACL correlated significantly with the MAS 
in one of the two college samples. The mean 
of 10 nonexamination day AACLs correlated 
significantly with the MAS. The mean of the 
3 examination day AACLs did not correlate 
significantly with the MAS, although the score 
on the first examination day did correlate sig- 
nificantly with the MAS. AACL and MAS 
were highly correlated in a group of pregnant
women.
The General version of the AACL is suggested
as a quick measure of general anxiety
level. The Today version of the test is
suggested as a technique in studies where
repeated assessments of anxiety must be made
within a relatively limited time interval, i.e.,
weeks or months. The time set of the AACL
can be changed by a simple adjustment of the
instructions. In the study of pregnant women
a “month” set was used. In an experiment
taking place over an interval of a few hours
a “now” or “this minute” set could be used.
It is assumed that the anxiety scoring key
will be valid regardless of the time set. The
studies reported give some evidence to support
this assumption although only future work can
establish the fact. Other experimenters may
desire to develop new keys for this test.
“Depression” and “Hostility” keys are two
obvious possibilities.


AN EXTENSION OF THE CONSTRUCT VALIDITY OF
THE EGO STRENGTH SCALE 


BENJAMIN KLEINMUNTZ 


Carnegie Institute of Technology


The present study was designed to add
strands to the nomological network (Cronbach
& Meehl, 1955) which comprise the
Ego Strength (Es) scale (Barron, 1953). It
was reasoned that if the Es scale can
differentiate between psychiatric and
nonpsychiatric adults (Gottesman, 1959) and if it
really measures ego strength, then adjusted
college students should get significantly higher
scores on this scale than do maladjusted
college students. Furthermore, if one of the
things which may be said about the Es scale
is that it reflects a defensive test taking
attitude, then it would seem that our sample of
adjusted students’ defenses would be higher
than those of the maladjusted students.

A random sample of 50 students
(27 males and 23 females) was drawn from
a group of 300 sophomore and junior Teachers
College candidates at the University of
Nebraska. MMPIs had been administered to
these students as a routine screening device,
and it was generally known that the test
results had some bearing on their futures in the
Teachers College. The criterion of adjustment
was met if these students indicated on the
Cornell Medical Index that they had never
been under psychiatric treatment and if they
had never contacted the Mental Health
Division at the University of Nebraska. The
group of maladjusted students consisted of
33 individuals (24 males and 9 females) who
had voluntarily sought psychiatric consultation
at the Mental Health Division and who
had remained in treatment for five or more
interviews. It is possible of course, with such
criteria of adjustment and maladjustment,
that a number of really well adjusted
individuals might be included in the maladjusted
group and that conversely a number of
maladjusted individuals might be included in the
adjusted group. If there is an error here,
however, the error is in the direction of
making it more difficult for the Es scale to
discriminate between the two groups.

The MMPI records of the two groups were
rescored for Es and K and it was found that
both scales do tend to broadly discriminate
between adjusted and maladjusted college
students. These results appear in Table 1.
The mean differences between the two
groups on the two scales were significant
beyond the [illegible]1 level of confidence.
The mean Es score of the adjusted
students was much higher than that of the
maladjusted students; and the mean K scale
score of the adjusted students was also much
higher than that of the maladjusted.
Correlations of [illegible] ([illegible] with
high K) and .4[illegible] between Es and K for
adjusted and maladjusted students,
respectively, lend additional evidence that Es is
in some measure an index of defensiveness.
The obtained mean differences in K and
Es scores and the magnitude of the
correlation coefficients are consistent with the
findings of Gottesman (1959). In the light of
these findings it may be suggested that with
a college population Es scores could be used
in profile analysis to supplement information
provided by the other scales of the MMPI.

tions designed to determine if they recognized
that they had been reinforced.

The results are consistent with the findings of
Wickes (1956) and of Gross (1959). Verbal
reinforcement influenced the rate of the reinforced
class of responses in the predicted direction for
each group. On the basis of a one-tailed analysis
the differences are significant at better than the
.05 level of confidence.

“Awareness” was defined as a verbal report
concerning the nature of the reinforcing stimulus.
Further, the S was asked if he believed the E was
attempting to influence his productions. No S
reported that he had been “aware,” and only two
stated that the E had verbalized “mmm-hmm.”
It would seem that the Ss were not simply
cooperating but were responding to the effect of
the reinforcing procedure.

On the basis of the above reported research
and other similar studies, it is possible for the
examiner, because of some “unconscious” bias,
to influence test protocols. At the very least,
this is an argument for highly controlled test
conditions.

Gardner, E. F., & Thompson, G. G. Syracuse Scales
of Social Relations. Elementary Level, Grades 5
and 6, Junior and Senior high school level. Test
booklets, pkg. of 35 ($4.90) includes scoring guide,
report folder, tally sheets, class record, and manual,
24 pp. Yonkers-on-Hudson, N. Y.: World Book
Company, 1959.


This sociometric device makes use of two
hypothetical situations as a basis for ratings by each
student of his classmates. Approximately parallel forms
are developed for use with elementary, junior, and
senior high school pupils. One of the situations, at
all three levels, involves rating others’ ability to offer
support, comfort, and sympathy, and is intended to
reflect need for succorance; the other is specific to
the level. At the elementary level
achievement-recognition is tapped; at Junior high, deference;
at Senior high, playmirth. The student’s frame of
reference for his ratings is established in a forced
distribution, using individuals selected from all the
persons he knows as reference points. Since every
pupil is evaluated by every other one, information
becomes available on: (a) how each pupil views his
classmates as being able to satisfy two of his
important psychological needs; (b) how each pupil is
evaluated by his classmates as being able to satisfy
their needs. Large samples, unspecified with regard
to such relevant factors as intelligence and
socioeconomic status, provide a not fully satisfactory
normative basis for interpreting the average ratings
given and received for each need.

An instrument like this one departs from the pat

Evalua-
not susceptible to simple summary. Brief reference


In this pioneering work three psychologists
investigate the process of clinical
inference. Succinctly stated, the central
problem of the book is: How does
the behavior analyst proceed from raw
data to refined inference? How does
he construct a diagnosis, form an
assessment, or create a description of
another person? The original theory
of cognition developed here adds a new
dimension to the understanding of how
we acquire and use information about
other people.

The authors’ formulation of a theory
of the inference process was motivated
by findings that statistical predictions
of behavior are superior to “intuitive”
or clinical predictions. The work de-

conclusion that both clinical and
statistical predictions are adaptations of